Someone asked in another thread how people who aren't very enthusiastic about writing Java/Scala code can contribute effectively to the Kafka project.
I wanted to advocate for an area I think is really important and not as good as it could be: the client ecosystem. I think our goal is to make Kafka effective as a general-purpose, centralized data subscription system. This vision only really works if all your applications are able to integrate easily, whatever language they are in. We have a number of pretty good non-Java producers. What we have been lacking is the server-side features that would make writing non-Java consumers easy. We are fixing that as part of the consumer work going on right now, which moves a lot of the functionality in the Java consumer to the server side. But apart from this, I think there may be a lot more we can do to make the client ecosystem better.

Here are some concrete ideas. If anyone has additional ideas, please reply to this thread and share them. If you are interested in picking any of these up, please do.

1. The most obvious way to improve the ecosystem is to help work on clients. This doesn't necessarily mean writing new clients, since in many cases we already have a client in a given language. I think we should do whatever we can to incentivize fewer, better clients rather than many half-working ones. That said, we are now working on server-side consumer coordination, so it should be possible to write much simpler consumers.

2. It would be great if someone put together a mailing list just for client developers to share tips, tricks, problems, and so on. We can make sure all the main contributors are on it too. I think this could be a forum for directing improvements in this area.

3. Help improve the documentation on how to implement a client. We have tried to make the protocol spec not just a dry document but one that also shares best practices, rationale, and intentions. I think this could be even better, as there is a real range of options, from a very simple, quick implementation to a more complex, highly optimized version.
It would be good to document some of the options and tradeoffs.

4. Come up with a standard way of documenting the features of clients. In an ideal world it would be possible to get the same information (author, language, feature set, download link, source code, etc.) for all clients. It would be great to standardize the documentation for each client as well, for example by having one or two basic examples that are repeated for every client in a standardized way. This would let someone who is not a Java developer come to the Kafka site, click on the link for their language, and view examples of interacting with Kafka in the language they know, using the client they would eventually use.

5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this: anyone who wants to implement a client would implement a simple command-line program with a set of standardized options. The compatibility kit would be a standard set of scripts that run the client through this command-line driver and validate its behavior. E.g. for a producer it would test that it can correctly send messages, that ordering is retained, and that the client correctly handles reconnection, metadata refresh, and compression. The output would be a list of the features that passed and are therefore certified, and perhaps basic performance information. This would be an easy way to help client developers write correct clients, and it would give us a standardized comparison showing which clients work correctly.

-Jay
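
P.S. To make idea 5 a little more concrete, here is a minimal sketch of what the validation half of such a kit might look like. None of this exists today: the function names, the feature names ("delivery", "ordering"), and the report format are all made up for illustration. It assumes the kit already knows which messages it asked the client's driver to send, that it has read the topic back by some other means, and that the messages are unique and went to a single partition.

```python
def certify(sent, received):
    """Compare the messages a driver was asked to send with what was read back.

    Hypothetical checks, assuming unique messages on a single partition.
    Returns a dict mapping feature name -> True/False.
    """
    received_set = set(received)
    sent_set = set(sent)
    # Delivery: every message we asked the client to send showed up.
    delivered = sent_set <= received_set
    # Ordering: the messages that did arrive kept their relative send order.
    arrived = [m for m in received if m in sent_set]
    expected = [m for m in sent if m in received_set]
    return {"delivery": delivered, "ordering": arrived == expected}


def format_report(results):
    """Render one 'feature: PASS/FAIL' line per check, sorted by feature name."""
    return [
        "{}: {}".format(name, "PASS" if ok else "FAIL")
        for name, ok in sorted(results.items())
    ]


if __name__ == "__main__":
    sent = ["m0", "m1", "m2"]
    read_back = ["m0", "m2", "m1"]  # pretend the client reordered two messages
    for line in format_report(certify(sent, read_back)):
        print(line)
```

A real kit would also have to launch the driver, bounce brokers to exercise reconnection and metadata refresh, and try each compression codec; the sketch covers only the comparison step.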