Yeah, this is a good point. I don't think we have really called that out in any way. I think ideally the hashing should be documented and should be an official part of the contract.

-Jay
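The difference Joel describes below is easy to see in code. What follows is a minimal, purely illustrative Java sketch of the two defaults; murmur2Sketch is a hypothetical stand-in for the murmur hash the new producer uses internally (it is not the real algorithm), and only the hashCode-versus-murmur contrast matters here.

    import java.nio.charset.StandardCharsets;

    public class DefaultPartitioningSketch {

        // Old (Scala) producer default: hash of the key *object*.
        static int oldDefaultPartition(Object key, int numPartitions) {
            return (key.hashCode() & 0x7fffffff) % numPartitions;
        }

        // New (Java) producer default: murmur hash of the *serialized* key bytes.
        static int newDefaultPartition(byte[] keyBytes, int numPartitions) {
            return (murmur2Sketch(keyBytes) & 0x7fffffff) % numPartitions;
        }

        // Hypothetical stand-in for a murmur-style hash; NOT the real algorithm.
        static int murmur2Sketch(byte[] data) {
            int h = 0x9747b28c;
            for (byte b : data) {
                h = h * 31 + (b & 0xff);
            }
            return h;
        }

        public static void main(String[] args) {
            String key = "user-42";
            int partitions = 8;
            // The same key can land on different partitions across the two producers.
            System.out.println("old producer: " + oldDefaultPartition(key, partitions));
            System.out.println("new producer: " + newDefaultPartition(
                    key.getBytes(StandardCharsets.UTF_8), partitions));
        }
    }

Even though both results are valid partition numbers, keys that the old producer routed together can be split differently by the new one, which is exactly why the default hashing deserves to be documented as part of the contract.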
On Wed, Jan 14, 2015 at 6:15 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> (Deviating a bit from the points in the last set of emails.) I just wanted to ask if there was any heads-up regarding the change in default partitioning behavior in the new producer. Chris (cc'd) encountered this while upgrading to the new producer: the new producer uses a murmur hash by default (which I agree is the right thing to do, btw) and the old producer uses hashCode on the original partitioning key object. This affects users that depend on the default partitioning logic. Although the new producer allows you to do the partitioning explicitly outside it, users will need to be aware of this change when switching to the new producer.
>
> On Mon, Jan 12, 2015 at 06:08:14PM -0800, Jay Kreps wrote:
> > Hey Joe,
> >
> > Yeah, I think a lot of those items are limitations in that document that we should definitely fix.
> >
> > The first issue you point out is a serious one: we give the total list of errors but don't list which errors can result from which APIs. This is a big issue because actually no one knows, and even if you know the code base, determining that from the code is not trivial (since errors can percolate up from lower layers). If you are writing a client, in practice you just try stuff, handle the errors that you've seen, and add some generic catch-all for any new errors (which is actually a good forward-compatibility practice). But it would be a lot easier if this kind of trial and error wasn't required. Having just done the Java producer and consumer, I definitely felt that pain.
> >
> > The second issue I think we kind of tried to address by giving basic usage info for things like metadata requests, etc. But I think what you are pointing out is that this just isn't nearly detailed enough. Ideally we should give a lot of guidance on implementation options, optimizations, best practices, etc. I agree with this. Especially as we start to get the new consumer protocols in shape, having this is really important for helping people make use of them, as there are several APIs that work together. I think we could expand this section of the docs a lot.
> >
> > I think it also might be a good idea to move this document out of the wiki and into the main docs. This way we can version it with releases. Currently there is no way to tell which API versions are supported in which Kafka version, as the document is always the current state of the protocol minus stuff on trunk that isn't released yet. This mostly works since in practice, if you are developing a client, you should probably target the latest release, but it would be better to be able to tell what was in each release.
> >
> > -Jay
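For concreteness, the generic catch-all that Jay describes above might look like the sketch below. It is purely illustrative: the error-code values and the handler shape are assumptions made for this example, not Kafka's actual definitions.

    import java.util.Map;

    public class UnknownErrorHandlingSketch {

        // Error codes this client's author has handled so far; the numeric values
        // here are assumptions for the example, not an authoritative mapping.
        static final Map<Short, String> KNOWN_ERRORS = Map.of(
                (short) 0, "NONE",
                (short) 3, "UNKNOWN_TOPIC_OR_PARTITION",
                (short) 6, "NOT_LEADER_FOR_PARTITION");

        static void handleProduceError(short errorCode) {
            String name = KNOWN_ERRORS.get(errorCode);
            if (name == null) {
                // Generic catch-all: an unrecognized code from a newer broker is
                // logged and treated as a retriable failure instead of crashing.
                System.err.println("Unrecognized error code " + errorCode
                        + "; treating it as retriable");
            } else if (!"NONE".equals(name)) {
                System.err.println("Produce failed with " + name + "; retrying with backoff");
            }
        }

        public static void main(String[] args) {
            handleProduceError((short) 3);  // an error the client knows about
            handleProduceError((short) 99); // a code this client has never seen
        }
    }

The point is simply that a client which only recognizes the codes it has already seen keeps working (falling back to its normal retry/backoff path) when a newer broker returns a code it has never encountered.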
> > On Mon, Jan 12, 2015 at 5:50 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> > > Having an index for every protocol/API change (like https://www.python.org/dev/peps/pep-0257/ ) will be much better than the flat wire protocol doc we have now. It is impossible (without jumping into code) right now to know if an error, or even a particular message, is supported in one version of Kafka vs. another. Having something that is iterative for each change and that is explicit, clear, and concise for client developers would be wonderful. Some folks just try to keep pace with the wire protocol doc regardless, and often end up developing the wrong functionality, because the expected behavior is not always part of the protocol but an expectation/extension of the producer and/or consumer layer in the project code.
> > >
> > > The "expected behavior" I think is a huge gap between the project and client implementations. When you are a Kafka user you have certain expectations when working with producers and consumers. E.g., if a produced message fails, the expectation is to retry X times with a backoff of Y between each try. The wire protocol doc doesn't always expose these "features" that are expected behaviors, and they often get missed. Assumptions get made, and in the clients that get developed, very large features can take a while (often surfacing via production issues) to be discovered. I think this problem (which is a big one IMHO) will also be better resolved with the KIP process. Client application developers can look at new features, understand the goals and expectations, develop those goals in the language/system required, and support the byte structure(s) for a complete use case.
> > >
> > > I think child pages from https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol might be a way to go. I only suggest that because people already use that page now, and we can keep it as a high-level "here is what you do" and then sub-link to the child pages when appropriate. I hate completely abandoning something that is not entirely bad but just missing some updates in different ways. So maybe something like that, or having a specific part committed under git or svn, might also make sense.
> > >
> > > I am not really opinionated on how we implement, as long as we do implement something for these issues.
> > >
> > > Feature and/or byte changes should bump the version number, +1
> > >
> > > /*******************************************
> > >  Joe Stein
> > >  Founder, Principal Consultant
> > >  Big Data Open Source Security LLC
> > >  http://www.stealth.ly
> > >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > ********************************************/
> > >
> > > On Mon, Jan 12, 2015 at 8:27 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > Yeah, I think this makes sense. Some of the crazy nesting will get better when we move to the new protocol definition I think, but we will always need some kind of if statement that branches for the different behavior, and this makes testing difficult.
> > > >
> > > > Probably the best thing to do would be to announce a version as deprecated, which would have no functional effect but would serve as a warning that it is going away, and then remove it some time later. This would mean including something that notes this in the protocol docs and maybe the release notes. We should probably just always do this for all but the latest version of all APIs. I think probably a year of deprecation should be sufficient prior to removal?
> > > >
> > > > I also think we can maybe use some common sense in deciding this. Removing older versions will always be bad for users and client developers and always be good for Kafka committers. I think we can be more aggressive on things that are not heavily used (and hence less bad for users) or for which supporting multiple versions is particularly onerous.
> > > >
> > > > -Jay
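One possible shape for the deprecation window Jay sketches above, written as a hedged Java illustration: the version bounds, the log message, and the per-version branches are invented for this example and are not the broker's actual code.

    public class VersionDispatchSketch {

        // Hypothetical version bounds for a single API; real values would live
        // alongside the protocol definition, not be hard-coded like this.
        static final short OLDEST_SUPPORTED = 0;
        static final short OLDEST_NON_DEPRECATED = 1;
        static final short LATEST = 2;

        static void handleOffsetCommit(short version) {
            if (version < OLDEST_SUPPORTED || version > LATEST) {
                throw new IllegalArgumentException(
                        "Unsupported offset commit version " + version);
            }
            if (version < OLDEST_NON_DEPRECATED) {
                // The deprecation window: the old version still works, but the
                // protocol docs and release notes would flag it for removal.
                System.err.println("Offset commit version " + version
                        + " is deprecated and will be removed in a future release");
            }
            // The per-version branching that makes the server code nested and
            // hard to test; every supported version adds another arm here.
            switch (version) {
                case 0:
                    // behavior specific to the oldest version (details omitted)
                    break;
                default:
                    // behavior shared by the newer versions (details omitted)
                    break;
            }
        }

        public static void main(String[] args) {
            handleOffsetCommit((short) 0); // prints a deprecation warning
            handleOffsetCommit((short) 2); // latest version, no warning
        }
    }

This also shows the per-version branching Guozhang mentions next: every version kept alive adds another arm to reason about and test.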
> > > > On Mon, Jan 12, 2015 at 5:02 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > > > > +1 on the version evolving with any protocol / data format / functionality change, and I am wondering if we have a standard process for deprecating old versions? Today, with just a couple of versions for the protocol (e.g. offset commit), the code on the server side is already pretty nested and complicated in order to support the different versions.
> > > > >
> > > > > On Mon, Jan 12, 2015 at 9:21 AM, Jay Kreps <j...@confluent.io> wrote:
> > > > > > Hey Jun,
> > > > > >
> > > > > > Good points.
> > > > > >
> > > > > > I totally agree that the versioning needs to cover both format and behavior if the behavior change is incompatible.
> > > > > >
> > > > > > I kind of agree about the stable/unstable stuff. What I think this means is not that we would ever evolve the protocol without changing the version, but rather that we would drop support for older versions quicker. On one hand that makes sense, and it is often a high bar to get things right the first time. On the other hand, I think in practice the set of people who interact with the protocol is often different from the end users. So the end-user experience may still be "hey, my code just broke" because some client they use relied on an unstable protocol unbeknownst to them. But I think all that means is that we should be thoughtful about removing support for old protocol versions even if they were marked unstable.
> > > > > >
> > > > > > Does anyone else have feedback or thoughts on the KIP stuff? Objections? Thoughts on structure?
> > > > > >
> > > > > > -Jay
> > > > > >
> > > > > > On Mon, Jan 12, 2015 at 8:20 AM, Jun Rao <j...@confluent.io> wrote:
> > > > > > > Jay,
> > > > > > >
> > > > > > > Thanks for bringing this up. Yes, we should increase the level of awareness of compatibility.
> > > > > > >
> > > > > > > For 1 and 2, they should probably include any functional change. For example, even if there is no change in the binary data format but the interpretation is changed, we should consider this a binary format change and bump up the version number.
> > > > > > >
> > > > > > > 3. Having a wider discussion of api/protocol/data changes on the mailing list seems like a good idea.
> > > > > > >
> > > > > > > 7. It might be good to also document which APIs/protocols/data formats are considered stable (or unstable). For example, in the 0.8.2 release we will have a few new protocols (e.g. HeartBeat) for the development of the new consumer. Those new protocols probably shouldn't be considered stable until the new consumer is more fully developed.
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Fri, Jan 9, 2015 at 4:29 PM, Jay Kreps <j...@confluent.io> wrote:
> > > > > > > > Hey guys,
> > > > > > > >
> > > > > > > > We had a bit of a compatibility slip-up in 0.8.2 with the offset commit stuff. We caught this one before the final release, so it's not too bad. But I do think it kind of points to an area where we could do better.
> > > > > > > >
> > > > > > > > One piece of feedback we have gotten from going out and talking to users is that compatibility is really, really important to them. Kafka is getting deployed in big environments where the clients are embedded in lots of applications, and any kind of incompatibility is a huge pain for people using it and generally makes upgrades difficult or impossible.
> > > > > > > >
> > > > > > > > In practice, what I think this means for development is a lot more pressure to really think about the public interfaces we are making and to try our best to get them right. This can be hard sometimes, as changes come in patches and it is hard to follow every single rb with enough diligence to know.
> > > > > > > >
> > > > > > > > Compatibility really means a few things:
> > > > > > > > 1. Protocol changes
> > > > > > > > 2. Binary data format changes
> > > > > > > > 3. Changes in public APIs in the clients
> > > > > > > > 4. Configs
> > > > > > > > 5. Metric names
> > > > > > > > 6. Command line tools
> > > > > > > >
> > > > > > > > I think 1-2 are critical, 3 is very important, and 4, 5, and 6 are pretty important but not critical.
> > > > > > > >
> > > > > > > > One thing this implies is that we are really going to have to do a good job of thinking about APIs and use cases. You can definitely see a number of places in the old clients and in a couple of the protocols where enough care was not given to thinking things through. Some of those were from long, long ago, but we should really try to avoid adding to that set, because increasingly we will have to carry around these mistakes for a long time.
> > > > > > > >
> > > > > > > > Here are a few things I thought we could do that might help us get better in this area:
> > > > > > > >
> > > > > > > > 1. Technically we are just in a really bad place with the protocol because it is defined twice: once in the old Scala request objects, and once in the new protocol format for the clients. This makes changes massively painful. The good news is that the new request definition DSL was intended to make adding new protocol versions a lot easier and clearer.
> > > > > > > > It will also make it a lot more obvious when the protocol is changed, since you will be checking in or reviewing a change to Protocol.java. Getting the server moved over to the new request objects and protocol definition will be a bit of a slog, but I think it will really help here.
> > > > > > > >
> > > > > > > > 2. We need to get some testing in place on cross-version compatibility. This is work, and no tests here will be perfect, but I suspect with some effort we could catch a lot of things.
> > > > > > > >
> > > > > > > > 3. I was also thinking it might be worth it to get a little more formal about the review and discussion process for things that will have an impact on these public areas, to ensure we end up with something we are happy with. Python has a PEP process (https://www.python.org/dev/peps/pep-0257/) by which major changes are made, and it might be worth it for us to do a similar thing. We have essentially been doing this already: major changes almost always have an associated wiki, but I think just getting a little more rigorous might be good. The idea would be to call out these wikis as official proposals and do a full Apache discuss/vote thread for these important changes. We would use these for big features (security, log compaction, etc.) as well as for small changes that introduce or change a public API/config/etc. This is a little heavier weight, but I think it is really just critical that we get these things right, and this would be a way to call out this kind of change so that everyone would take the time to look at them.
> > > > > > > >
> > > > > > > > Thoughts?
> > > > > > > >
> > > > > > > > -Jay
> > > > >
> > > > > --
> > > > > -- Guozhang
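As a rough illustration of the cross-version testing in item 2, here is a self-contained sketch: the two-field request, its version numbering, and the wire layout are all made up for this example and are far simpler than the real protocol schemas.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class CrossVersionCompatSketch {

        // Hypothetical wire layout for v0 of a request: {topicLength, topicBytes, offset}.
        // A later v1 appends a retentionMs field.
        static ByteBuffer encodeV0(String topic, long offset) {
            byte[] t = topic.getBytes(StandardCharsets.UTF_8);
            ByteBuffer buf = ByteBuffer.allocate(2 + t.length + 8);
            buf.putShort((short) t.length).put(t).putLong(offset);
            buf.flip();
            return buf;
        }

        // A decoder that supports both versions has to keep parsing v0 bytes until
        // that version is formally deprecated and removed.
        static long decodeOffset(ByteBuffer buf, short version) {
            short topicLength = buf.getShort();
            buf.position(buf.position() + topicLength); // skip the topic name
            long offset = buf.getLong();
            if (version >= 1) {
                buf.getLong(); // retentionMs, present only in v1 and later
            }
            return offset;
        }

        public static void main(String[] args) {
            // The "test": bytes produced by the old encoder must still decode correctly.
            ByteBuffer oldBytes = encodeV0("my-topic", 42L);
            long decoded = decodeOffset(oldBytes, (short) 0);
            if (decoded != 42L) {
                throw new AssertionError("v0 request no longer decodes correctly");
            }
            System.out.println("v0 bytes still decode: offset=" + decoded);
        }
    }

A suite of checks like this, generated from the protocol definition itself, would help catch the kind of offset-commit slip-up Jay mentions above in an automated way rather than by inspection.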