Jay, Ewen had a good example:
1. 0.10.0 (protocol v4) is released with current KIP-35 2. On trunk, modify produce requests and bump to v5 3. On trunk, we modify metadata requests and bump to v6 4. Now we decide that the metadata change fixes a super critical issue and want to backport the change. What's the protocol version of the next release of 0.10.0 - which supports v6 protocol only partially? Gwen On Mon, Mar 14, 2016 at 6:38 PM, Jay Kreps <j...@confluent.io> wrote: > Hey Dana, > > I am actually thinking about it differently. Basically I think you are > imagining a world in which the Kafka code is the source of truth, and > the Kafka developers make random changes that inflict pain on you at > will. The protocol documentation is basically just some semi-accurate > description of what the code does. It sounds like this isn't too far > from the actual world. :-) In that world I agree that the best we > could do would be to assign some id to versions (the md5 of the Kafka > jar, maybe) and put in various checks around that in clients to try to > keep things working. > > But imagine a different approach where we try to really treat the > protocol as a document and treat that as the source of truth. We try > to make this document cover what is and isn't specified and make it > cover enough to support client implementations and a given Kafka > version covers some range of protocols explicitly. There is a version > of this document for each protocol version. The code implements the > protocol rather than vice versa. > > So in other words protocol changes are totally ordered and separate > from code development (we might develop them together but the protocol > assignment would come when you checked in the new protocol version > which would happen with your code). > > This was really the intention with the protocol originally (though we > were doing it on a per-api basis), but I think that understanding was > not shared by the full team and we have not done a great job of > important things like documentation or explaining how this are > supposed to work so we fall back on the "the protocol is whatever the > code does" thing. > > Does that make sense? In that sense think one of the more important > things we could get out of this would not be more versioning features > so much as clear docs and processes around protocol versioning. > > -Jay > > On Mon, Mar 14, 2016 at 6:22 PM, Dana Powers <dana.pow...@gmail.com> > wrote: > > Is a linear protocol int consistent with the current release model? It > > seems like that would break down w/ the multiple release branches that > are > > all simultaneously maintained? Or is it implicit that no patch release > can > > ever bump the protocol int? Or maybe the protocol int gets some extra > > "wiggle" on minor / major releases to create unallocated version ints > that > > could be used on future patch releases / backports? > > > > I think the protocol version int does make sense for folks deploying from > > trunk. > > > > -Dana > > > > On Mon, Mar 14, 2016 at 6:13 PM, Jay Kreps <j...@confluent.io> wrote: > > > >> Yeah I think that is the point--we have a proposal for a new protocol > >> versioning scheme and a vote on it but it doesn't actually describe > >> how versioning will work yet! I gave my vague impression based on this > >> thread, but I want to make sure that is correct and get it written > >> down before we adopt it. > >> > >> -Jay > >> > >> On Mon, Mar 14, 2016 at 5:58 PM, Gwen Shapira <g...@confluent.io> > wrote: > >> > On Mon, Mar 14, 2016 at 5:31 PM, Jay Kreps <j...@confluent.io> wrote: > >> > > >> >> Couple of missing things: > >> >> > >> >> This KIP doesn't have a proposal on versioning it just gives > different > >> >> options, it'd be good to get a concrete proposal in the KIP. Here is > my > >> >> understanding of what we are proposing (can someone sanity check and > if > >> >> correct, update the kip): > >> >> > >> >> 1. We will augment the existing api_version field in the header > with > >> a > >> >> protocol_version that will begin at some initial value and > increment > >> by > >> >> 1 > >> >> every time we make a changes to any of the api_versions (question: > >> >> including internal apis?). > >> >> > >> > > >> > Jay, this part was not in the KIP and was never discussed. > >> > Are you proposing adding this? Or is it just an assumption you made? > >> > > >> > > >> > > >> >> 2. The protocol_version will be added to the metadata request > >> >> 3. We will also add a string that this proposal is calling > >> VersionString > >> >> which will describe the build of kafka in some way. The clients > >> should > >> >> not > >> >> under any circumstances do anything with this string other than > >> print it > >> >> out to the user. > >> >> > >> >> One thing I'm not sure about: I think currently metadata sits in the > >> client > >> >> for 10 mins by default. Say a client bootstraps and then a server is > >> >> downgraded to an earlier version, won't the client's metadata version > >> >> indicate that that client handles a version it doesn't actually > handle > >> any > >> >> more? We need to document how clients will handle this. > >> >> > >> >> Here are some comments on other details: > >> >> > >> >> 1. As a minor thing I think we should avoid naming the fields > >> VersionId > >> >> and VersionString which sort of implies they are both used for > >> >> versioning. > >> >> I think we should call them something like ProtocolVersion and > >> >> BuildDescription, with BuildDescription being totally unspecified > >> other > >> >> than that it is some kind of human readable string describing a > >> >> particular > >> >> Kafka build. We really don't want a client attempting to use this > >> >> string in > >> >> any way as that would always be the wrong thing to do in the > >> versioning > >> >> scheme we are proposing, you should always use the protocol > version. > >> >> 2. Does making the topics field in the metadata request nullable > >> >> actually make sense? We have a history of wanting to add magical > >> values > >> >> rather than fields. Currently topics=[a] means give me information > >> about > >> >> topic a, topics=[] means give me information about all topics, > and we > >> >> are > >> >> proposing topics=null would mean don't give me topics. I don't > have a > >> >> strong opinion. > >> >> 3. I prefer Jason's proposal on using a conservative metadata > version > >> >> versus the empty response hack. However I think that may actually > >> >> exacerbate the downgrade scenario I described. > >> >> 4. I agree with Jason that we should really look at the details of > >> the > >> >> implementation so we know it works--implementing server support > >> without > >> >> actually trying it is kind of risky. > >> >> > >> >> As a meta comment: I'd really like to encourage us to think of the > >> protocol > >> >> as a document that includes the following things: > >> >> > >> >> - The binary format, error codes, etc > >> >> - The request/response interaction > >> >> - The semantics of each request in different cases > >> >> - Instructions on how to use this to implement a client > >> >> > >> >> This document is versioned with the protocol number and is the > source of > >> >> truth for the protocol. > >> >> > >> >> Part of any protocol change needs to be an update to the > instructions on > >> >> how to use that part of the protocol. We should be opinionated. If > there > >> >> are two options there should be a reason, and then we need to > document > >> both > >> >> and say exactly when to use each. > >> >> > >> >> I think we also need to get a "how to" document on protocol changes > >> just so > >> >> people know what they need to do to add a new protocol feature. > >> >> > >> >> -Jay > >> >> > >> >> On Mon, Mar 14, 2016 at 4:51 PM, Jason Gustafson <ja...@confluent.io > > > >> >> wrote: > >> >> > >> >> > > > >> >> > > Are you suggesting this as a solution for the problem where a > KIP-35 > >> >> > aware > >> >> > > client sends a higher version of metadata req, say v2, to a > KIP-35 > >> >> aware > >> >> > > broker that only supports up to v1. > >> >> > > >> >> > > >> >> > Yes, that's right. In that case, the client first sends v1, finds > out > >> >> that > >> >> > the broker supports v2, and then sends v2 (if it has any reason to > do > >> >> so). > >> >> > > >> >> > We had a bit of discussion on such scenarios, and it seemed to be a > >> >> chicken > >> >> > > and egg problem that is hard to avoid. Your suggestion definitely > >> makes > >> >> > > sense, however it falls under the purview of clients. > >> >> > > >> >> > > >> >> > That basically means clients should figure it out for themselves? > >> Might > >> >> be > >> >> > nice to have a better answer. > >> >> > > >> >> > KIP-35 only aims on adding support for getting the version info > from a > >> >> > > broker. This definitely can be utilized by our clients. However, > >> that > >> >> can > >> >> > > follow KIP-35 changes. Does this sound reasonable to you? > >> >> > > >> >> > > >> >> > It may be OK, but I'm a little concerned about offering a feature > >> that we > >> >> > don't support ourselves. Sometimes it's not until implementation > that > >> we > >> >> > find out whether it really works as expected. And if we're > eventually > >> >> > planning to support it, I feel we should think through some of the > >> cases > >> >> a > >> >> > bit more. For example, the upgrade and downgrade cases that Becket > >> >> > mentioned earlier. It doesn't feel too great to support this > feature > >> >> unless > >> >> > we can offer guidance on how to use it. > >> >> > > >> >> > -Jason > >> >> > > >> >> > > >> >> > > >> >> > On Mon, Mar 14, 2016 at 4:20 PM, Ashish Singh <asi...@cloudera.com > > > >> >> wrote: > >> >> > > >> >> > > Hi Jason, > >> >> > > > >> >> > > On Mon, Mar 14, 2016 at 4:04 PM, Jason Gustafson < > >> ja...@confluent.io> > >> >> > > wrote: > >> >> > > > >> >> > > > Perhaps clients should always send the oldest version of the > >> metadata > >> >> > > > request which supports KIP-35 when initially connecting to the > >> >> cluster. > >> >> > > > Depending on the versions in the response, it can upgrade to a > >> more > >> >> > > recent > >> >> > > > version. Then maybe we don't need the empty response hack? > >> >> > > > > >> >> > > Are you suggesting this as a solution for the problem where a > KIP-35 > >> >> > aware > >> >> > > client sends a higher version of metadata req, say v2, to a > KIP-35 > >> >> aware > >> >> > > broker that only supports up to v1. > >> >> > > > >> >> > > We had a bit of discussion on such scenarios, and it seemed to > be a > >> >> > chicken > >> >> > > and egg problem that is hard to avoid. Your suggestion definitely > >> makes > >> >> > > sense, however it falls under the purview of clients. > >> >> > > > >> >> > > > > >> >> > > > One thing that's not clear to me is whether the ultimate goal > of > >> this > >> >> > KIP > >> >> > > > is to have our clients support multiple broker versions. It > would > >> be > >> >> a > >> >> > > > little weird to have this feature if our own clients don't use > it. > >> >> > > > > >> >> > > KIP-35 only aims on adding support for getting the version info > >> from a > >> >> > > broker. This definitely can be utilized by our clients. However, > >> that > >> >> can > >> >> > > follow KIP-35 changes. Does this sound reasonable to you? > >> >> > > > >> >> > > > > >> >> > > > -Jason > >> >> > > > > >> >> > > > On Mon, Mar 14, 2016 at 3:34 PM, Ashish Singh < > >> asi...@cloudera.com> > >> >> > > wrote: > >> >> > > > > >> >> > > > > On Mon, Mar 14, 2016 at 2:12 PM, Ismael Juma < > ism...@juma.me.uk > >> > > >> >> > > wrote: > >> >> > > > > > >> >> > > > > > On Mon, Mar 14, 2016 at 8:45 PM, Gwen Shapira < > >> g...@confluent.io > >> >> > > >> >> > > > wrote: > >> >> > > > > > > > >> >> > > > > > > > I don't see how it helps. If the client is > communicating > >> >> with a > >> >> > > > > broker > >> >> > > > > > > that > >> >> > > > > > > > does not support KIP-35, that broker will simply close > the > >> >> > > > > connection. > >> >> > > > > > If > >> >> > > > > > > > the broker supports KIP-35, then it will provide the > >> broker > >> >> > > > version. > >> >> > > > > I > >> >> > > > > > > > don't envisage a scenario where a broker does not > support > >> >> > KIP-35, > >> >> > > > but > >> >> > > > > > > > implements the new behaviour of sending an empty > >> response. Do > >> >> > > you? > >> >> > > > > > > > > >> >> > > > > > > > Are you sure about that? Per KIP-35, the broker > supplies > >> the > >> >> > > > version > >> >> > > > > in > >> >> > > > > > > response to Metadata request, not in response to anything > >> else. > >> >> > > > > > > If the client sends producer request version 42 > >> (accidentally > >> >> or > >> >> > > due > >> >> > > > to > >> >> > > > > > > premature upgrade) to KIP-35-compactible broker - we > want to > >> >> see > >> >> > an > >> >> > > > > empty > >> >> > > > > > > packet and not a connection close. > >> >> > > > > > > Sending a broker version was deemed impractical IIRC. > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > OK, so this is a different case than the one Ashish > described > >> >> > > ("client > >> >> > > > > that > >> >> > > > > > wants to support broker versions that do not provide broker > >> >> version > >> >> > > in > >> >> > > > > > metadata and broker versions that provides version info in > >> >> > > metadata"). > >> >> > > > > So, > >> >> > > > > > you are suggesting that if a client is communicating with a > >> >> broker > >> >> > > that > >> >> > > > > > implements KIP-35 and it receives an empty response, it > will > >> >> assume > >> >> > > > that > >> >> > > > > > the broker doesn't support the request version and it won't > >> try > >> >> to > >> >> > > > parse > >> >> > > > > > the response? I think it would be good to explain this > kind of > >> >> > thing > >> >> > > in > >> >> > > > > > detail in the KIP. > >> >> > > > > > > >> >> > > > > Actually even in this case and the case I mentioned, closing > >> >> > connection > >> >> > > > > should be fine. Lets think about possible reasons that could > >> lead > >> >> to > >> >> > > this > >> >> > > > > issue. > >> >> > > > > > >> >> > > > > 1. Client has incorrect mapping of supported protocols for a > >> broker > >> >> > > > > version. > >> >> > > > > 2. Client misread broker version from metadata response. > >> >> > > > > 3. Client constructed unsupported protocol version by > mistake. > >> >> > > > > > >> >> > > > > In all the above cases irrespective of what broker does, > client > >> >> will > >> >> > > keep > >> >> > > > > sending wrong request version. > >> >> > > > > > >> >> > > > > At this point, I think sending an empty packet instead of > >> closing > >> >> > > > > connection is a nice to have and not mandatory requirement. > >> Like in > >> >> > the > >> >> > > > > above case, a client can catch parsing error and be sure that > >> there > >> >> > is > >> >> > > > > something wrong in the protocol version it is sending. > However, > >> a > >> >> > > generic > >> >> > > > > connection close does not really provide any information on > >> >> probable > >> >> > > > cause. > >> >> > > > > > >> >> > > > > What do you guys suggest? > >> >> > > > > > >> >> > > > > > > >> >> > > > > > Ismael > >> >> > > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > -- > >> >> > > > > > >> >> > > > > Regards, > >> >> > > > > Ashish > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > -- > >> >> > > > >> >> > > Regards, > >> >> > > Ashish > >> >> > > > >> >> > > >> >> > >> >