Kafka now has built-in support for managing offsets itself instead of ZK, and it is easy to use and to migrate to from the current ZK implementation. I think the real question here is whether we need to manage offsets at the Spark Streaming level or leave that to the user.
If you want to manage offsets at the user level, letting Spark t
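If offsets are left to the user, the usual pattern is to save the processed offset ranges to the application's own store once a batch is safely written, and read them back on restart. A minimal sketch of that idea in plain Scala — every name here is hypothetical, this is not Spark's API:

```scala
import scala.collection.mutable

// Hypothetical types for illustration; not Spark or Kafka classes.
case class OffsetRange(topic: String, partition: Int, fromOffset: Long, untilOffset: Long)

class OffsetStore {
  // In a real app this map would be a database or other durable store.
  private val committed = mutable.Map.empty[(String, Int), Long]

  // After a batch's output is safely written, record each range's end offset.
  def commit(ranges: Seq[OffsetRange]): Unit =
    ranges.foreach(r => committed((r.topic, r.partition)) = r.untilOffset)

  // On restart, resume from the last committed offset (0 = earliest here).
  def resumeFrom(topic: String, partition: Int): Long =
    committed.getOrElse((topic, partition), 0L)
}
```

Because the offsets are committed by the application only after its own output succeeds, a replayed batch overwrites rather than duplicates.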
The only dependency on Zookeeper I see is here:
https://github.com/apache/spark/blob/1c5475f1401d2233f4c61f213d1e2c2ee9673067/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/ReliableKafkaReceiver.scala#L244-L247
If that's the only line that depends on Zookeeper, we could probably tr
There are already private methods in the code for interacting with Kafka's
offset management API.
There's a JIRA for making those methods public, but TD has been reluctant
to merge it:
https://issues.apache.org/jira/browse/SPARK-10963
I think adding any ZK-specific behavior to Spark is a bad idea.
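One way to keep ZK-specific behavior out of Spark itself is to hide the commit backend behind a small interface, so ZK, Kafka's own offset management, or a user database are all interchangeable. A rough sketch of that shape (hypothetical names, not actual Spark code):

```scala
// Hypothetical abstraction for illustration only.
trait OffsetCommitter {
  def commit(group: String, topic: String, partition: Int, offset: Long): Unit
  def fetch(group: String, topic: String, partition: Int): Option[Long]
}

// In-memory stand-in; a real backend would talk to ZK, Kafka's offset
// management API, or an external store behind the same interface.
class InMemoryCommitter extends OffsetCommitter {
  private val store = scala.collection.mutable.Map.empty[(String, String, Int), Long]
  def commit(group: String, topic: String, partition: Int, offset: Long): Unit =
    store((group, topic, partition)) = offset
  def fetch(group: String, topic: String, partition: Int): Option[Long] =
    store.get((group, topic, partition))
}
```

With that seam in place, nothing in the streaming code needs to know which storage system holds the offsets.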