Hello, I am currently working on a Kafka implementation and have a couple of questions concerning the roadmap for the future. As I am unsure where to put such questions, I decided to try my luck on this mailing list. If this is the wrong place for such inquiries, I apologize; in that case it would be great if someone could offer some pointers as to where to find these answers.
So, here I go :)

1) Consumer Redesign in Kafka 0.9
I found a number of documents explaining the planned changes to the consumer APIs for Kafka 0.9. However, these documents mention only the high-level consumer implementation. Does anyone know whether the kafka.javaapi.consumer.SimpleConsumer API/implementation will also change with 0.9, or will it stay more or less as it is now?

2) Pooling of Kafka Connections - SimpleConsumer
I have a use case where the connection between the final consumers and Kafka needs to happen via HTTP, so I am concerned about the performance implications of the required HTTP wrapping. I am planning to implement a custom HTTP API for Kafka producers and consumers which will be stateless; offset tracking will be done on the final consumer side. Does anyone have experience with pooling connections to Kafka brokers so that they can be reused efficiently for incoming, stateless HTTP REST calls? One idea would be to keep one connection pool per broker host, each holding a set of open consumers/connections for that broker. Once I know which broker is the leader for the topic partition requested by a REST call, I could take an already open consumer/connection from that broker's pool, process the call, and then return it to the pool. That way I could handle REST calls in a completely stateless fashion without having to open/close Kafka connections all the time.

3) Pooling of Kafka Connections - KafkaConsumer (Kafka 0.9)
Now let's assume I want to implement the idea from 2), but with the new high-level KafkaConsumer (leaving identification of partition leaders and error handling to it). Have any implementation details been decided on how the subscribe, unsubscribe and seek methods will work internally? Would I be able to somehow reuse connected KafkaConsumer objects in connection pools? Could I, for example, call subscribe/unsubscribe/seek on a consumer for each HTTP request to switch to the currently needed set of topics/partitions, or would this be a very expensive operation (e.g. because it would fetch metadata from Kafka to identify the leader for each partition)?

Greetings
Valentin
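To make question 2) more concrete, here is a rough sketch of the per-broker pooling I have in mind. BrokerConnection and BrokerConnectionPools are hypothetical names of my own, not Kafka classes; in a real implementation BrokerConnection would wrap an open kafka.javaapi.consumer.SimpleConsumer:

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an open connection to one broker; a real
// version would hold a connected kafka.javaapi.consumer.SimpleConsumer.
class BrokerConnection {
    final String brokerHost;
    BrokerConnection(String brokerHost) { this.brokerHost = brokerHost; }
}

// One bounded pool of open connections per broker host, so a stateless
// HTTP handler can borrow a connection to the partition leader and
// return it afterwards instead of reconnecting on every request.
class BrokerConnectionPools {
    private final int poolSize;
    private final Map<String, BlockingQueue<BrokerConnection>> pools =
            new ConcurrentHashMap<>();

    BrokerConnectionPools(int poolSize) { this.poolSize = poolSize; }

    private BlockingQueue<BrokerConnection> poolFor(String broker) {
        // Lazily create a pre-filled pool the first time a broker is seen.
        return pools.computeIfAbsent(broker, b -> {
            BlockingQueue<BrokerConnection> q = new ArrayBlockingQueue<>(poolSize);
            for (int i = 0; i < poolSize; i++) q.add(new BrokerConnection(b));
            return q;
        });
    }

    // Blocks until a connection for the given broker is free.
    BrokerConnection borrow(String broker) throws InterruptedException {
        return poolFor(broker).take();
    }

    // Hands the connection back for the next stateless request.
    void release(BrokerConnection c) {
        pools.get(c.brokerHost).offer(c);
    }
}
```

A request handler would first resolve the leader broker for the requested topic partition (e.g. via a topic metadata request) and then borrow from that broker's pool for the duration of the REST call.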