Hi all,

As a client engineer on the Python client, I would really appreciate a separate mailing list for client implementation discussion and a language-agnostic test suite. What might also be really useful is an enumerated list of error conditions and the expected behavior for each. For instance, what do you do if you have a multi-partition producer that tries to produce to a non-existent topic? The metadata request is going to return nothing, which means you don't know where to send the request at all. You could just arbitrarily send it to a broker, I guess?
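
To make the question concrete, here is a minimal sketch of one possible answer: fail loudly instead of picking a broker at random. All names here (`UnknownTopicError`, `choose_partition`, and the `metadata` dict) are hypothetical and are not taken from any existing client; the hashing scheme is only a placeholder.

```python
class UnknownTopicError(Exception):
    """Raised when the metadata lists no partitions for a topic (hypothetical)."""


def choose_partition(metadata, topic, key):
    """Pick a partition for `key`, or raise if the topic has no partitions.

    `metadata` is an assumed dict of topic name -> list of partition ids,
    standing in for whatever a client builds from a metadata response.
    `key` is a bytes object.
    """
    partitions = metadata.get(topic) or []
    if not partitions:
        # Nothing to route to: surface the condition to the caller
        # rather than guessing a broker.
        raise UnknownTopicError(topic)
    # Placeholder deterministic partitioner: byte sum modulo partition count.
    return sorted(partitions)[sum(key) % len(partitions)]
```

Whether raising, retrying after a metadata refresh, or something else is the "right" behavior is exactly the kind of thing an enumerated list of error conditions would settle.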
At any rate, I have lots of questions about a formalized "certified client" process. I'm not against the idea (in fact quite the opposite), but I'm concerned that non-Java clients will be constrained purely to the currently existing Java API in the name of client uniformity and standardization.

-Mark

On Sat, Jul 19, 2014 at 12:30 AM, Timothy Chen <tnac...@gmail.com> wrote:

> The certified client test suite really will benefit all the client
> developers, as writing a Kafka client often is not just talking protocol
> but to be able to handle correctly all the cases, errors and situations,
> but also performance.
>
> From my experience writing a C# client I definitely feel that a lot of test
> scenarios could be generalized and used for all clients.
>
> I was reviewing some other client implementation and there are errors and
> cases it didn't handle and having a suite that exposes that will allow
> users to not run into those problems and try to determine whether it's a
> client or server bug as it's sometimes hard to figure out.
>
> Tim
>
>
> On Jul 18, 2014, at 3:57 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > Basically my thought with getting a separate mailing list was to have
> > a place specifically to discuss issues around clients. I don't see a
> > lot of discussion about them on the main list. I thought perhaps this
> > was because people don't like to ask questions which are about
> > adjacent projects/code bases. But basically whatever will lead to a
> > robust discussion, bug tracking, etc on clients.
> >
> > -Jay
> >
> >> On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao <jun...@gmail.com> wrote:
> >> Another important part of eco-system could be around the adaptors of
> >> getting data from other systems into Kafka and vice versa. So, for the
> >> ingestion part, this can include things like getting data from MySQL,
> >> syslog, Apache server log, etc. For the egress part, this can include
> >> putting Kafka data into HDFS, S3, etc.
> >>
> >> Will a separate mailing list be convenient? Could we just use the Kafka
> >> mailing list?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >>> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >>>
> >>> A question was asked in another thread about what was an effective way
> >>> to contribute to the Kafka project for people who weren't very
> >>> enthusiastic about writing Java/Scala code.
> >>>
> >>> I wanted to kind of advocate for an area I think is really important
> >>> and not as good as it could be--the client ecosystem. I think our goal
> >>> is to make Kafka effective as a general purpose, centralized, data
> >>> subscription system. This vision only really works if all your
> >>> applications are able to integrate easily, whatever language they are
> >>> in.
> >>>
> >>> We have a number of pretty good non-Java producers. We have been
> >>> lacking the features on the server-side to make writing non-Java
> >>> consumers easy. We are fixing that as part of the consumer
> >>> work going on right now (which moves a lot of the functionality in the
> >>> Java consumer to the server side).
> >>>
> >>> But apart from this I think there may be a lot more we can do to make
> >>> the client ecosystem better.
> >>>
> >>> Here are some concrete ideas. If anyone has additional ideas please
> >>> reply to this thread and share them. If you are interested in picking
> >>> any of these up, please do.
> >>>
> >>> 1. The most obvious way to improve the ecosystem is to help work on
> >>> clients. This doesn't necessarily mean writing new clients, since in
> >>> many cases we already have a client in a given language. I think any
> >>> way we can incentivize fewer, better clients rather than many
> >>> half-working clients we should do. However we are working now on the
> >>> server-side consumer co-ordination so it should now be possible to
> >>> write much simpler consumers.
> >>>
> >>> 2.
It would be great if someone put together a mailing list just for
> >>> client developers to share tips, tricks, problems, and so on. We can
> >>> make sure all the main contributors are on this too. I think this could
> >>> be a forum for kind of directing improvements in this area.
> >>>
> >>> 3. Help improve the documentation on how to implement a client. We
> >>> have tried to make the protocol spec not just a dry document but also
> >>> have it share best practices, rationale, and intentions. I think this
> >>> could potentially be even better as there is really a range of options
> >>> from a very simple quick implementation to a more complex highly
> >>> optimized version. It would be good to really document some of the
> >>> options and tradeoffs.
> >>>
> >>> 4. Come up with a standard way of documenting the features of clients.
> >>> In an ideal world it would be possible to get the same information
> >>> (author, language, feature set, download link, source code, etc) for
> >>> all clients. It would be great to standardize the documentation for
> >>> the client as well. For example having one or two basic examples that
> >>> are repeated for every client in a standardized way. This would let
> >>> someone come to the Kafka site who is not a Java developer, and click
> >>> on the link for their language and view examples of interacting with
> >>> Kafka in the language they know using the client they would eventually
> >>> use.
> >>>
> >>> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
> >>> anyone who wants to implement a client would implement a simple
> >>> command line program with a set of standardized options. The
> >>> compatibility kit would be a standard set of scripts that run their
> >>> client using this command line driver and validate its behavior. E.g.
> >>> for a producer it would test that it can correctly send messages, that
> >>> the ordering is retained, that the client correctly handles
> >>> reconnection and metadata refresh, and compression. The output would
> >>> be a list of the features that passed and are certified, and perhaps
> >>> basic performance information. This would be an easy way to help client
> >>> developers write correct clients, as well as having a standardized
> >>> comparison for the clients that says that they work correctly.
> >>>
> >>> -Jay
> >>>
>
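
As a rough illustration of the compatibility-kit idea in item 5 of the quoted message, the validation step for a producer run might look something like the sketch below. Everything in it is an assumption made for illustration: the function name `validate_producer_run`, the feature names, and the premise that the kit already knows what it asked the driver to send and what it read back from a single test partition. It does not cover reconnection, metadata refresh, compression, or performance.

```python
def validate_producer_run(sent, consumed):
    """Return a dict of feature name -> bool for one driver run (hypothetical).

    `sent` is the list of messages the kit asked the client's command line
    driver to produce, in order. `consumed` is the list of messages the kit
    read back from the single test partition, in log order.
    """
    return {
        # Every message arrived exactly once, in any order.
        "delivery": sorted(sent) == sorted(consumed),
        # Messages arrived exactly once and in the order they were sent.
        "ordering": list(sent) == list(consumed),
    }
```

A report built from such per-feature booleans would give the "list of features that passed" described above.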