Hello,

I am currently working on a Kafka implementation and have a couple of
questions concerning the future roadmap.
As I am unsure where to ask such questions, I decided to try my luck on
this mailing list. If this is the wrong place for such inquiries, I
apologize; in that case it would be great if someone could offer some
pointers as to where these answers can be found.

So, here I go :)

1) Consumer Redesign in Kafka 0.9
I found a number of documents explaining planned changes to the consumer
APIs for Kafka version 0.9. However, these documents only mention the
high-level consumer implementations. Does anyone know whether the
kafka.javaapi.consumer.SimpleConsumer API/implementation will also change
with 0.9, or will it stay more or less as it is now?

2) Pooling of Kafka Connections - SimpleConsumer
As I have a use case where the connection between the final consumers and
Kafka needs to go via HTTP, I am concerned about the performance
implications of the required HTTP wrapping. I am planning to implement a
custom HTTP API for Kafka producers and consumers which will be stateless
and where offset tracking is done on the final consumer side. My question
is whether anyone has experience with pooling connections to Kafka brokers
in order to reuse them effectively for incoming, stateless HTTP REST calls.
The idea would be to have one connection pool per broker host and to keep a
set of open consumers/connections for each broker in those pools. Once I
know which broker is the leader for the topic partition requested by a REST
call, I could take an existing consumer/connection from that broker's pool,
process the call with it, and then return it to the pool. That way I could
handle REST calls in a completely stateless fashion without having to open
and close Kafka connections all the time.
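
To make the idea concrete, here is a minimal sketch of such a per-broker
pool built on the current 0.8 SimpleConsumer. The class name, pool size,
timeouts and client id are placeholders I made up for illustration, not a
finished design:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import kafka.javaapi.consumer.SimpleConsumer;

// One pool of SimpleConsumer instances per broker, keyed by "host:port".
public class BrokerConsumerPool {
  private final ConcurrentMap<String, BlockingQueue<SimpleConsumer>> pools =
      new ConcurrentHashMap<String, BlockingQueue<SimpleConsumer>>();
  private static final int MAX_PER_BROKER = 10;      // assumed pool size
  private static final int SO_TIMEOUT_MS = 30000;    // assumed socket timeout
  private static final int BUFFER_SIZE = 64 * 1024;  // assumed fetch buffer size

  // Borrow a consumer for the broker that leads the requested partition.
  public SimpleConsumer borrow(String host, int port) {
    SimpleConsumer consumer = poolFor(host, port).poll();
    return consumer != null
        ? consumer
        : new SimpleConsumer(host, port, SO_TIMEOUT_MS, BUFFER_SIZE,
                             "http-bridge"); // client id is a placeholder
  }

  // Return the consumer once the REST call has been served.
  public void giveBack(String host, int port, SimpleConsumer consumer) {
    if (!poolFor(host, port).offer(consumer)) {
      consumer.close(); // pool is full, drop the connection
    }
  }

  private BlockingQueue<SimpleConsumer> poolFor(String host, int port) {
    String key = host + ":" + port;
    BlockingQueue<SimpleConsumer> pool = pools.get(key);
    if (pool == null) {
      pools.putIfAbsent(key,
          new ArrayBlockingQueue<SimpleConsumer>(MAX_PER_BROKER));
      pool = pools.get(key);
    }
    return pool;
  }
}

The REST handler would then resolve the partition leader first (e.g. via a
TopicMetadataRequest), borrow from that leader's pool, issue the fetch, and
give the consumer back afterwards.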

3) Pooling of Kafka Connections - KafkaConsumer (Kafka 0.9)
Now let's assume I want to implement the idea from 2), but with the
high-level KafkaConsumer (to leave identification of partition leaders and
error handling to it). Are any implementation details already known or
decided on how the subscribe, unsubscribe and seek methods will work
internally? Would I be able to somehow reuse connected KafkaConsumer
objects in connection pools? Could I, for example, call
subscribe/unsubscribe/seek on a consumer for each HTTP request to switch
topics/partitions to the currently needed set, or would this be a very
expensive operation (e.g. because it would fetch metadata from Kafka to
identify the leader for each partition)?
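
To illustrate the kind of per-request reuse I am asking about, the sketch
below uses assign/seek/poll as stand-ins for whatever the final
per-partition subscription calls end up being. I am only guessing at the
names and signatures from the draft API documents, so treat this as an
assumption, not the real 0.9 API:

import java.util.Collections;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class StatelessFetchSketch {

  // Per REST call: re-point a pooled consumer at the requested partition,
  // seek to the client-supplied offset and fetch once. Whether this
  // re-pointing is cheap is exactly my question.
  static List<ConsumerRecord<byte[], byte[]>> handleRestCall(
      KafkaConsumer<byte[], byte[]> consumer,
      String topic, int partition, long offset) {
    TopicPartition tp = new TopicPartition(topic, partition);
    consumer.assign(Collections.singletonList(tp)); // switch partitions
    consumer.seek(tp, offset);           // offset tracked by the final consumer
    return consumer.poll(500).records(tp); // single fetch, then back to the pool
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092"); // placeholder host
    props.put("key.deserializer",
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    props.put("value.deserializer",
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    KafkaConsumer<byte[], byte[]> consumer =
        new KafkaConsumer<byte[], byte[]>(props);
    System.out.println(handleRestCall(consumer, "my-topic", 0, 42L).size());
    consumer.close();
  }
}

If re-pointing like this only updates local state and reuses the already
established broker connections, pooling would be cheap; if every
assign/seek triggers a fresh metadata round trip, it probably would not be.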

Greetings
Valentin
