Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

Jun Rao Mon, 19 Jan 2015 09:38:28 -0800

For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 and
then drop the ack > 1 support in trunk (w/o bumping up the protocol
version)? Thanks,


Jun
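
For illustration only, a minimal Python sketch of the two-step plan above: warn on ack > 1 in one release, reject it in the next. This is not Kafka's broker code. The function name, the exception class, and the reject_unsupported flag are all hypothetical; "min.isr" is simply the name used in this thread.

```python
import logging

logger = logging.getLogger("acks_policy")

# Values the thread treats as meaningful:
# 0 (no ack), 1 (leader only), -1 (all in-sync replicas).
SUPPORTED_ACKS = {0, 1, -1}


class InvalidRequiredAcksError(ValueError):
    """Hypothetical error raised once acks > 1 is no longer supported."""


def check_required_acks(acks: int, reject_unsupported: bool) -> int:
    """Return acks if usable; warn or reject when it is greater than 1."""
    if acks in SUPPORTED_ACKS:
        return acks
    if acks > 1 and not reject_unsupported:
        # Transitional release: keep the old behaviour, but leave a trail
        # in the log so admins can track down the clients involved.
        logger.warning(
            "acks=%d is deprecated; use acks=-1 with min.isr instead", acks
        )
        return acks
    raise InvalidRequiredAcksError(f"unsupported required acks: {acks}")
```

With reject_unsupported=False this models the "log a warning" release; flipping it to True models the later release that drops the support.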

On Sun, Jan 18, 2015 at 8:24 PM, Gwen Shapira <[email protected]> wrote:

> Overall, agree on point #1, less sure on point #2.
>
> 1. Some protocols never ever add new errors, while others add errors
> without bumping versions. HTTP is a good example of the second type.
> HTTP-451 was added fairly recently, there are some errors specific to
> NGINX, etc. No one cares. I think we should properly document in the
> wire-protocol doc that new errors can be added, and I think we should
> strongly suggest (and implement ourselves) that unknown error codes
> should be shown to users (or at least logged), so they can be googled
> and understood through our documentation.
> In addition, a hierarchy of error codes, so that clients can tell whether
> an error is retriable just by looking at the code, could be nice. Same
> for adding an error string to the protocol. These are future
> enhancements that should be discussed separately.
>
> 2. I think we want to allow admins to upgrade their Kafka brokers
> without having to chase down clients in their organization and without
> getting blamed if clients break. I think it makes sense to have one
> version that will support existing behavior, but log warnings, so
> admins will know about misbehaving clients and can track them down
> before an upgrade that breaks them (or before the broken config causes
> them to lose data!). Hopefully this is indeed a very rare behavior and
> we are taking extra precaution for nothing, but I have customers where
> one traumatic upgrade means they will never upgrade Kafka again, so
> I'm being conservative.
>
> Gwen
>
>
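
For illustration only, a minimal Python sketch of point 1 above as seen from the client side: unknown error codes are surfaced in a searchable form instead of breaking the client, and each known code carries a retriable flag. This is not taken from any Kafka client. The table is an illustrative subset, and the retriable flags, the names, and the "KAFKA-ERR" prefix are assumptions.

```python
from dataclasses import dataclass

# Illustrative subset only; codes and retriable flags are assumptions
# made for this sketch, not an authoritative list.
KNOWN_ERRORS = {
    0: ("NONE", False),
    6: ("NOT_LEADER_FOR_PARTITION", True),
    7: ("REQUEST_TIMED_OUT", True),
}


@dataclass(frozen=True)
class ErrorInfo:
    code: int
    name: str
    retriable: bool
    known: bool


def describe_error(code: int) -> ErrorInfo:
    """Look up a code; fall back to a blanket unknown-error entry."""
    if code in KNOWN_ERRORS:
        name, retriable = KNOWN_ERRORS[code]
        return ErrorInfo(code, name, retriable, True)
    # Unknown codes are not retried blindly, but they are still surfaced.
    return ErrorInfo(code, "UNKNOWN", False, False)


def format_error(info: ErrorInfo) -> str:
    """Render a message that keeps the raw code so it can be searched for."""
    return f"KAFKA-ERR-{info.code} ({info.name})"
```

A client would call describe_error on every response and log format_error for anything it does not handle specially, so a code added later by the broker still reaches the user with its number intact.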
> On Sun, Jan 18, 2015 at 3:50 PM, Jun Rao <[email protected]> wrote:
> > Overall, I agree with Jay on both points.
> >
> > 1. I think it's reasonable to add new error codes w/o bumping up the
> > protocol version. In most cases, by adding new error codes, we are just
> > refining the categorization of those unknown errors. So, a client
> > shouldn't behave worse than before as long as unknown errors have been
> > properly handled.
> >
> > 2. I think it's reasonable to just document that 0.8.2 will be the last
> > release that will support ack > 1 and remove the support completely in
> > trunk w/o bumping up the protocol. This is because (a) we never included
> > ack > 1 explicitly in the documentation and so the usage should be
> > limited; (b) ack > 1 doesn't provide the guarantee that people really
> > want and so it shouldn't really be used.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Sun, Jan 18, 2015 at 11:03 AM, Jay Kreps <[email protected]> wrote:
> >>
> >> Hey guys,
> >>
> >> I really think we are discussing two things here:
> >>
> >> 1. How should we generally handle changes to the set of errors? Should
> >> introducing new errors be considered a protocol change or should we
> >> reserve the right to introduce new error codes?
> >> 2. Given that this particular change is possibly incompatible, how
> >> should we handle it?
> >>
> >> I think it would be good for people who are responding here to be
> >> specific about which they are addressing.
> >>
> >> Here is what I think:
> >>
> >> 1. Errors should be extensible within a protocol version.
> >>
> >> We should change the protocol documentation to list the errors that can
> >> be given back from each api, their meaning, and how to handle them, BUT
> >> we should explicitly state that the set of errors is open-ended. That
> >> is, we should reserve the right to introduce new errors and explicitly
> >> state that clients need a blanket "unknown error" handling mechanism.
> >> The error can link to the protocol definition (something like "Unknown
> >> error 42, see protocol definition at http://link"). We could make this
> >> work really well by instructing all the clients to report the error in
> >> a very googlable way, as Oracle does with their error format (e.g.
> >> "ORA-32"), so that if you ever get the raw error google will take you
> >> to the definition.
> >>
> >> I agree that a more rigid definition seems like the "right thing", but
> >> having just implemented two clients and spent a bunch of time on the
> >> server side, I think it will work out poorly in practice. Here is why:
> >>
> >> I think we will make a lot of mistakes in nailing down the set of error
> >> codes up front and we will end up going through 3-4 churns of the
> >> protocol definition just realizing the set of errors that can be
> >> thrown. I think this churn will actually make life worse for clients
> >> that now have to figure out 7 identical versions of the protocol and
> >> will be a mess in terms of testing on the server side. I actually know
> >> this to be true because while implementing the clients I tried to guess
> >> the errors that could be thrown, then checked my guess by close code
> >> inspection. It turned out that I always missed things in my belief
> >> about errors, but more importantly even after close code inspection I
> >> found tons of other errors in my stress testing.
> >>
> >> In practice error handling always involves calling out one or two
> >> meaningful failures that have special recovery and then a blanket case
> >> that just handles everything else. It's true that some clients may not
> >> have done this well, but I think it is for the best if they fix that.
> >>
> >> Reserving the right to add errors doesn't mean we will do it without
> >> care. We will think through each change and decide whether giving a
> >> little more precision in the error is worth the overhead and churn of a
> >> protocol version bump.
> >>
> >> 2. In this case in particular we should not introduce a new protocol
> >> version.
> >>
> >> In this particular case we are saying that acks > 1 doesn't make sense
> >> and we want to give an error to people specifying this so that they
> >> change their configuration. This is a configuration that few people use
> >> and we want to just make it an error. The bad behavior will just be
> >> that the error will not be as good as it could be. I think that is a
> >> better tradeoff than introducing a separate protocol version (this may
> >> be true of the java clients too).
> >>
> >> We will have lots of cases like this in the future and we aren't going
> >> to want to churn the protocol for each of them. For example we
> >> previously had to get more precise about which characters were legal
> >> and which illegal in topic names.
> >>
> >> -Jay
> >>
> >> On Fri, Jan 16, 2015 at 11:55 AM, Gwen Shapira <[email protected]>
> >> wrote:
> >>>
> >>> I updated the KIP: Using acks > 1 in version 0 will log a WARN message
> >>> in the broker about client using deprecated behavior (suggested by Joe
> >>> in the JIRA, and I think it makes sense).
> >>>
> >>> Gwen
> >>>
> >>> On Fri, Jan 16, 2015 at 10:40 AM, Gwen Shapira <[email protected]>
> >>> wrote:
> >>> > How about we continue the discussion on this thread, so we won't lose
> >>> > the context of this discussion, and put it up for VOTE when this has
> >>> > been finalized?
> >>> >
> >>> > On Fri, Jan 16, 2015 at 10:22 AM, Neha Narkhede <[email protected]>
> >>> > wrote:
> >>> >> Gwen,
> >>> >>
> >>> >> KIP write-up looks good. According to the rest of the KIP process
> >>> >> proposal, would you like to start a DISCUSS/VOTE thread for it?
> >>> >>
> >>> >> Thanks,
> >>> >> Neha
> >>> >>
> >>> >> On Fri, Jan 16, 2015 at 9:37 AM, Ewen Cheslack-Postava
> >>> >> <[email protected]>
> >>> >> wrote:
> >>> >>
> >>> >>> Gwen -- KIP write up looks good. Deprecation schedule probably
> >>> >>> needs to be more specific, but I think that discussion probably
> >>> >>> needs to happen after a solution is agreed upon.
> >>> >>>
> >>> >>> Jay -- I think "older clients will get a bad error message instead
> >>> >>> of a good one" isn't what would be happening with this change.
> >>> >>> Previously they wouldn't have received an error and they would
> >>> >>> have been able to produce messages. After the change they'll just
> >>> >>> receive this new error message which their clients can't possibly
> >>> >>> handle gracefully since it didn't exist when the client was
> >>> >>> written. Whether the acks > 1 setting was actually accomplishing
> >>> >>> what they thought doesn't matter. Someone could have reasonably
> >>> >>> read the docs on 0.8.1.1, thought acks = 2 is an ok setting for
> >>> >>> their applications, set it as a default across a bunch of apps,
> >>> >>> then follow the recommended upgrade path of updating brokers to
> >>> >>> 0.8.2 and all their apps will start failing on produce requests.
> >>> >>>
> >>> >>>
> >>> >>> On Thu, Jan 15, 2015 at 8:27 PM, Jay Kreps <[email protected]>
> >>> >>> wrote:
> >>> >>>
> >>> >>> > This is a good case to discuss.
> >>> >>> >
> >>> >>> > Let's figure the general case of how we want to handle errors and
> >>> >>> > get that documented in the protocol. The problem right now is that
> >>> >>> > we give no guidance on this. I actually thought Gwen's suggestion
> >>> >>> > made sense on the guidance we should have given which is that we
> >>> >>> > will enumerate a set of errors and their meaning for each API but
> >>> >>> > it is possible that other errors will occur and they should be
> >>> >>> > handled (maybe poorly) in the same way UNKNOWN_ERROR is handled
> >>> >>> > which is our normal escape hatch for things like OOMException.
> >>> >>> >
> >>> >>> > I really do think we shouldn't be dogmatic here: In considering a
> >>> >>> > change to errors we should consider the potential ill-effect vs
> >>> >>> > the complexity of yet another protocol version.
> >>> >>> >
> >>> >>> > In this case I actually am not sure we need to bump the protocol
> >>> >>> > because the whole point of the change was to make a setting we
> >>> >>> > think doesn't make sense break, right? Well this will break it. It
> >>> >>> > seems like the only downside is that older clients will get a bad
> >>> >>> > error message instead of a good one. But it isn't like we will
> >>> >>> > have rendered a client unusable, it is just that they will need to
> >>> >>> > change their config.
> >>> >>> >
> >>> >>> > -Jay
> >>> >>> >
> >>> >>> > On Thu, Jan 15, 2015 at 6:14 PM, Gwen Shapira
> >>> >>> > <[email protected]>
> >>> >>> > wrote:
> >>> >>> >
> >>> >>> >> I created a KIP for this suggestion:
> >>> >>> >>
> >>> >>> >>
> >>> >>> >>
> >>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks
> >>> >>> >>
> >>> >>> >> Basically documenting what was already discussed here. Comments
> >>> >>> >> will be awesome!
> >>> >>> >>
> >>> >>> >> Gwen
> >>> >>> >>
> >>> >>> >> On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira
> >>> >>> >> <[email protected]>
> >>> >>> >> wrote:
> >>> >>> >> > The errors are part of the KIP process now, so I think the
> >>> >>> >> > clients are safe :)
> >>> >>> >> >
> >>> >>> >> > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> >>> >>> >> >
> >>> >>> >> > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin
> >>> >>> >> > <[email protected]>
> >>> >>> >> wrote:
> >>> >>> >> >> Agree errors should be part of the protocol
> >>> >>> >> >>
> >>> >>> >> >>> On Jan 15, 2015, at 17:59, Gwen Shapira
> >>> >>> >> >>> <[email protected]>
> >>> >>> >> wrote:
> >>> >>> >> >>>
> >>> >>> >> >>> Hi,
> >>> >>> >> >>>
> >>> >>> >> >>> I got convinced by Joe and Dana that errors are indeed part of
> >>> >>> >> >>> the protocol and can't be randomly added.
> >>> >>> >> >>>
> >>> >>> >> >>> So, it looks like we need to bump version of ProduceRequest in
> >>> >>> >> >>> the following way:
> >>> >>> >> >>> Version 0 -> accept acks >1. I think we should keep the
> >>> >>> >> >>> existing behavior too (i.e. not replace it with -1) to avoid
> >>> >>> >> >>> surprising clients, but I'm willing to hear other opinions.
> >>> >>> >> >>> Version 1 -> do not accept acks >1 and return an error.
> >>> >>> >> >>> Are we ok with the error I added in KAFKA-1697? We can use
> >>> >>> >> >>> something less specific like InvalidRequestParameter. This
> >>> >>> >> >>> error can be reused in the future and reduce the need to add
> >>> >>> >> >>> errors, but will also be less clear to the client and its
> >>> >>> >> >>> users. Maybe even add the error message string to the protocol
> >>> >>> >> >>> in addition to the error code? (since we are bumping
> >>> >>> >> >>> versions....)
> >>> >>> >> >>>
> >>> >>> >> >>> I think maintaining the old version throughout 0.8.X makes
> >>> >>> >> >>> sense. IMO dropping it for 0.9 is feasible, but I'll let client
> >>> >>> >> >>> owners help make that call.
> >>> >>> >> >>>
> >>> >>> >> >>> Am I missing anything? Should I start a KIP? It seems like a
> >>> >>> >> >>> KIP-type discussion :)
> >>> >>> >> >>>
> >>> >>> >> >>> Gwen
> >>> >>> >> >>>
> >>> >>> >> >>>
> >>> >>> >> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
> >>> >>> >> >>> <[email protected]> wrote:
> >>> >>> >> >>>> Gwen,
> >>> >>> >> >>>>
> >>> >>> >> >>>> I think the only option that wouldn't require a protocol version
> >>> >>> >> >>>> change is the one where acks > 1 is converted to acks = -1 since
> >>> >>> >> >>>> it's the only one that doesn't potentially break older clients. The
> >>> >>> >> >>>> protocol guide says that the expected upgrade path is servers
> >>> >>> >> >>>> first, then clients, so old clients, including non-java clients,
> >>> >>> >> >>>> that may be using acks > 1 should be able to work with a new broker
> >>> >>> >> >>>> version.
> >>> >>> >> >>>>
> >>> >>> >> >>>> It's more work, but I think dealing with the protocol change is the
> >>> >>> >> >>>> right thing to do since it eventually gets us to the behavior I
> >>> >>> >> >>>> think is better -- the broker should reject requests with invalid
> >>> >>> >> >>>> values. I think Joe and I were basically in agreement. In my mind
> >>> >>> >> >>>> the major piece missing from his description is how long we're
> >>> >>> >> >>>> going to maintain his "case 0" behavior. It's impractical to
> >>> >>> >> >>>> maintain old versions forever, but it sounds like there hasn't been
> >>> >>> >> >>>> a decision on how long to maintain them. Maybe that's another item
> >>> >>> >> >>>> to add to KIPs -- protocol versions and behavior need to be listed
> >>> >>> >> >>>> as deprecated and the earliest version in which they'll be removed
> >>> >>> >> >>>> should be specified so users can understand which versions are
> >>> >>> >> >>>> guaranteed to be compatible, even if they're using (well-written)
> >>> >>> >> >>>> non-java clients.
> >>> >>> >> >>>>
> >>> >>> >> >>>> -Ewen
> >>> >>> >> >>>>
> >>> >>> >> >>>>
> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers <
> >>> >>> >> [email protected]> wrote:
> >>> >>> >> >>>>>
> >>> >>> >> >>>>>> clients don't break on unknown errors
> >>> >>> >> >>>>>
> >>> >>> >> >>>>> maybe true for the official java clients, but I don't think the
> >>> >>> >> >>>>> assumption holds true for community-maintained clients and users of
> >>> >>> >> >>>>> those clients. kafka-python generally follows the fail-fast
> >>> >>> >> >>>>> philosophy and raises an exception on any unrecognized error code
> >>> >>> >> >>>>> in any server response. in this case, kafka-python allows users to
> >>> >>> >> >>>>> set their own required-acks policy when creating a producer
> >>> >>> >> >>>>> instance. It is possible that users of kafka-python have deployed
> >>> >>> >> >>>>> producer code that uses ack>1 -- perhaps in production environments
> >>> >>> >> >>>>> -- and for those users the new error code will crash their producer
> >>> >>> >> >>>>> code. I would not be surprised if the same were true of other
> >>> >>> >> >>>>> community clients.
> >>> >>> >> >>>>>
> >>> >>> >> >>>>> *one reason for the fail-fast approach is that there isn't great
> >>> >>> >> >>>>> documentation on what errors to expect for each request / response
> >>> >>> >> >>>>> -- so we use failures to alert that some error case is not handled
> >>> >>> >> >>>>> properly. and because of that, introducing new error cases without
> >>> >>> >> >>>>> bumping the api version is likely to cause those errors to get
> >>> >>> >> >>>>> raised/thrown all the way back up to the user. of course we (client
> >>> >>> >> >>>>> maintainers) can fix the issues in the client libraries and suggest
> >>> >>> >> >>>>> users upgrade, but it's not the ideal situation.
> >>> >>> >> >>>>>
> >>> >>> >> >>>>>
> >>> >>> >> >>>>> long-winded way of saying: I agree w/ Joe.
> >>> >>> >> >>>>>
> >>> >>> >> >>>>> -Dana
> >>> >>> >> >>>>>
> >>> >>> >> >>>>>
> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira <
> >>> >>> >> [email protected]>
> >>> >>> >> >>>>> wrote:
> >>> >>> >> >>>>>
> >>> >>> >> >>>>>> Is the protocol bump caused by the behavior change or the new
> >>> >>> >> >>>>>> error code?
> >>> >>> >> >>>>>>
> >>> >>> >> >>>>>> 1) IMO, error_codes are data, and clients can expect to receive
> >>> >>> >> >>>>>> errors that they don't understand (i.e. unknown errors). AFAIK,
> >>> >>> >> >>>>>> clients don't break on unknown errors, they are simply more
> >>> >>> >> >>>>>> challenging to debug. If we document the new behavior, then it's
> >>> >>> >> >>>>>> definitely debuggable and fixable.
> >>> >>> >> >>>>>>
> >>> >>> >> >>>>>> 2) The behavior change is basically a deprecation - i.e. acks > 1
> >>> >>> >> >>>>>> were never documented, and are not supported by Kafka clients
> >>> >>> >> >>>>>> starting with version 0.8.2. I'm not sure this requires a protocol
> >>> >>> >> >>>>>> bump either, although it's a better case than new error codes.
> >>> >>> >> >>>>>>
> >>> >>> >> >>>>>> Thanks,
> >>> >>> >> >>>>>> Gwen
> >>> >>> >> >>>>>>
> >>> >>> >> >>>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein <
> >>> >>> [email protected]>
> >>> >>> >> >>>>>> wrote:
> >>> >>> >> >>>>>>> Looping in the mailing list that the client developers live on
> >>> >>> >> >>>>>>> because they are not all on dev (though they should be if they
> >>> >>> >> >>>>>>> want to be helping to build the best client libraries they can).
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> I wholeheartedly believe that we need to not break existing
> >>> >>> >> >>>>>>> functionality of the client protocol, ever.
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> There are many reasons for this and we have other threads on the
> >>> >>> >> >>>>>>> mailing list where we are discussing that topic (no pun intended)
> >>> >>> >> >>>>>>> that I don't want to re-hash here.
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> If we change wire protocol functionality OR the binary format
> >>> >>> >> >>>>>>> (either) we must bump version AND treat version as a feature flag
> >>> >>> >> >>>>>>> with backward compatibility support until it is deprecated for
> >>> >>> >> >>>>>>> some time for folks to deal with it.
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> match version = {
> >>> >>> >> >>>>>>> case 0: keepDoingWhatWeWereDoing()
> >>> >>> >> >>>>>>> case 1: doNewStuff()
> >>> >>> >> >>>>>>> case 2: doEvenMoreNewStuff()
> >>> >>> >> >>>>>>> }
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> has to be a practice we adopt imho ... I know feature flags can
> >>> >>> >> >>>>>>> be construed as "messy code" but I am eager to hear another
> >>> >>> >> >>>>>>> (better? different?) solution to this.
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> If we don't do a feature flag like this specifically with this
> >>> >>> >> >>>>>>> change then what happens is that someone upgrades their brokers
> >>> >>> >> >>>>>>> with a rolling restart in 0.8.3 and every single one of their
> >>> >>> >> >>>>>>> producer requests starts to fail and they have a major production
> >>> >>> >> >>>>>>> outage. eeeek!!!!
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> I do 100% agree that > 1 makes no sense and we *REALLY* need
> >>> >>> >> >>>>>>> people to start using 0,1,-1 but we need to do that in a way that
> >>> >>> >> >>>>>>> is going to work for everyone.
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> Old producers and consumers must keep working with new brokers
> >>> >>> >> >>>>>>> and if we are not going to support that then I am unclear what
> >>> >>> >> >>>>>>> the use of "version" is based on our original intentions of
> >>> >>> >> >>>>>>> having it because of the 0.7 => 0.8. We said no more breaking
> >>> >>> >> >>>>>>> changes when we did that.
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> - Joe Stein
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava
> <
> >>> >>> >> >>>>>> [email protected]>
> >>> >>> >> >>>>>>> wrote:
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>> Right, so this looks like it could create an issue similar to
> >>> >>> >> >>>>>>>> what's currently being discussed in
> >>> >>> >> >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 where users now
> >>> >>> >> >>>>>>>> get errors under conditions when they previously wouldn't. Old
> >>> >>> >> >>>>>>>> clients won't even know about the error code, so besides failing
> >>> >>> >> >>>>>>>> they won't even be able to log any meaningful error messages.
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>> I think there are two options for compatibility:
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>> 1. An alternative change is to remove the ack > 1 code, but
> >>> >>> >> >>>>>>>> silently "upgrade" requests with acks > 1 to acks = -1. This isn't
> >>> >>> >> >>>>>>>> the same as other changes to behavior since the interaction
> >>> >>> >> >>>>>>>> between the client and server remains the same, no error codes
> >>> >>> >> >>>>>>>> change, etc. The client might just see some increased latency
> >>> >>> >> >>>>>>>> since the message might need to be replicated to more brokers
> >>> >>> >> >>>>>>>> than they requested.
> >>> >>> >> >>>>>>>> 2. Split this into two patches, one that bumps the protocol
> >>> >>> >> >>>>>>>> version on that message to include the new error code but
> >>> >>> >> >>>>>>>> maintains both old (now deprecated) and new behavior, then a
> >>> >>> >> >>>>>>>> second that would be applied in a later release that removes the
> >>> >>> >> >>>>>>>> old protocol + code for handling acks > 1.
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>> 2 is probably the right thing to do. If we specify the release
> >>> >>> >> >>>>>>>> when we'll remove the deprecated protocol at the time of
> >>> >>> >> >>>>>>>> deprecation it makes things a lot easier for people writing
> >>> >>> >> >>>>>>>> non-java clients and could give users better predictability (e.g.
> >>> >>> >> >>>>>>>> if clients are at most 1 major release behind brokers, they'll
> >>> >>> >> >>>>>>>> remain compatible but possibly use deprecated features).
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira <
> >>> >>> >> [email protected]>
> >>> >>> >> >>>>>>>> wrote:
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>>> Hi Kafka Devs,
> >>> >>> >> >>>>>>>>>
> >>> >>> >> >>>>>>>>> We are working on KAFKA-1697 - remove code related to ack>1 on
> >>> >>> >> >>>>>>>>> the broker. Per Neha's suggestion, I'd like to give everyone a
> >>> >>> >> >>>>>>>>> heads up on what these changes mean.
> >>> >>> >> >>>>>>>>>
> >>> >>> >> >>>>>>>>> Once this patch is included, any produce requests that include
> >>> >>> >> >>>>>>>>> request.required.acks > 1 will result in an exception. This will
> >>> >>> >> >>>>>>>>> be InvalidRequiredAcks in new versions (0.8.3 and up, I assume)
> >>> >>> >> >>>>>>>>> and UnknownException in existing versions (sorry, but I can't add
> >>> >>> >> >>>>>>>>> error codes retroactively).
> >>> >>> >> >>>>>>>>>
> >>> >>> >> >>>>>>>>> This behavior is already enforced by 0.8.2 producers (sync and
> >>> >>> >> >>>>>>>>> new), but we expect impact on users with older producers that
> >>> >>> >> >>>>>>>>> relied on acks > 1 and external clients (e.g. python, go, etc).
> >>> >>> >> >>>>>>>>>
> >>> >>> >> >>>>>>>>> Users who relied on acks > 1 are expected to switch to using
> >>> >>> >> >>>>>>>>> acks = -1 and a min.isr parameter that matches their use case.
> >>> >>> >> >>>>>>>>>
> >>> >>> >> >>>>>>>>> This change was discussed in the past in the context of
> >>> >>> >> >>>>>>>>> KAFKA-1555 (min.isr), but let us know if you have any questions
> >>> >>> >> >>>>>>>>> or concerns regarding this change.
> >>> >>> >> >>>>>>>>>
> >>> >>> >> >>>>>>>>> Gwen
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>>
> >>> >>> >> >>>>>>>> --
> >>> >>> >> >>>>>>>> Thanks,
> >>> >>> >> >>>>>>>> Ewen
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> --
> >>> >>> >> >>>>>>> You received this message because you are subscribed to
> >>> >>> >> >>>>>>> the
> >>> >>> Google
> >>> >>> >> >>>>>>> Groups
> >>> >>> >> >>>>>>> "kafka-clients" group.
> >>> >>> >> >>>>>>> To unsubscribe from this group and stop receiving emails
> >>> >>> >> >>>>>>> from
> >>> >>> it,
> >>> >>> >> send
> >>> >>> >> >>>>>>> an
> >>> >>> >> >>>>>>> email to [email protected].
> >>> >>> >> >>>>>>> To post to this group, send email to
> >>> >>> >> [email protected].
> >>> >>> >> >>>>>>> Visit this group at
> >>> >>> http://groups.google.com/group/kafka-clients.
> >>> >>> >> >>>>>>> To view this discussion on the web visit
> >>> >>> >> >>>>>>
> >>> >>> >> >>>>>>
> >>> >>> >>
> >>> >>>
> >>> >>>
> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com
> >>> >>> >> >>>>>> .
> >>> >>> >> >>>>>>>
> >>> >>> >> >>>>>>> For more options, visit
> >>> >>> >> >>>>>>> https://groups.google.com/d/optout.
> >>> >>> >> >>>>>>
> >>> >>> >> >>>>
> >>> >>> >> >>>>
> >>> >>> >> >>>>
> >>> >>> >> >>>>
> >>> >>> >> >>>> --
> >>> >>> >> >>>> Thanks,
> >>> >>> >> >>>> Ewen
> >>> >>> >>
> >>> >>> >
> >>> >>> >
> >>> >>>
> >>> >>>
> >>> >>> --
> >>> >>> Thanks,
> >>> >>> Ewen
> >>> >>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Thanks,
> >>> >> Neha
> >>
> >>
> >
> >
>
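
For illustration only, a minimal Python sketch of the version-as-feature-flag idea quoted earlier in the thread, applied to the acks check: version 0 keeps accepting acks > 1 and version 1 rejects it. This is not Kafka's broker code. The function name and the string labels are hypothetical; InvalidRequiredAcks is the error name mentioned in the thread, and "UnsupportedVersion" is an assumption.

```python
def handle_produce(api_version: int, acks: int) -> str:
    """Return a label for the action a broker might take for a request."""
    if api_version == 0:
        # Old clients: existing behaviour is preserved, including acks > 1.
        # No other validation is modelled in this sketch.
        return "accept"
    if api_version == 1:
        # Newer clients speak the stricter contract: only 0, 1 and -1.
        if acks in (0, 1, -1):
            return "accept"
        return "reject:InvalidRequiredAcks"
    # Versions this sketch does not know about.
    return "reject:UnsupportedVersion"
```

The point of the dispatch is that an old client sending version 0 never sees an error code it was not written to understand; only clients that opted in to version 1 can receive the new one.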
