Re: Determine Topic MetaData Spark Streaming Job

2016-01-25 Thread Gerard Maas

That's precisely what this constructor does: KafkaUtils.createDirectStream[...](ssc, kafkaConfig, topics). Is there a reason to do that yourself? In that case, look at how it's done in Spark Streaming for inspiration: https://github.com/apache/spark/blob/master/external/kafka/src/main/scala/org/ap

Re: Determine Topic MetaData Spark Streaming Job

2016-01-25 Thread Ashish Soni

Correct. What I am trying to achieve is this: before the streaming job starts, query the topic metadata from Kafka, determine all the partitions, and provide those to the direct API. So my question is: should I consider passing all the partitions from the command line and query Kafka and find and provide, wha

Re: Determine Topic MetaData Spark Streaming Job

2016-01-25 Thread Gerard Maas

What are you trying to achieve? It looks like you want to provide offsets, but you're not managing them, and I'm assuming you're using the direct stream approach. In that case, use the simpler constructor that takes the Kafka config and the topics. Let it figure out the offsets (it will contact kaf
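
For readers following the thread, here is a minimal sketch of the simpler constructor mentioned above, assuming the Kafka 0.8 direct-stream integration (the spark-streaming-kafka artifact) that was current when this thread was written. The broker address, topic name, batch interval, and object name are placeholders invented for illustration and do not come from the thread. No partition list or starting offsets are passed in; the sketch only prints the offset ranges each batch ends up covering.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

// Hypothetical object name; not from the thread.
object DirectStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DirectStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(10)) // assumed batch interval

    // Assumed broker address and topic name; replace with real values.
    val kafkaConfig = Map[String, String]("metadata.broker.list" -> "localhost:9092")
    val topics = Set("events")

    // Only the Kafka config and the topic names are supplied here;
    // no explicit partitions or offsets are passed by the caller.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaConfig, topics)

    // Each RDD produced directly by the direct stream exposes its offset ranges.
    stream.foreachRDD { rdd =>
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach { r =>
        println(s"topic=${r.topic} partition=${r.partition} from=${r.fromOffset} until=${r.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Running this requires a Spark deployment and a reachable Kafka broker, so treat it as an outline rather than a tested program. The cast to HasOffsetRanges is done on the RDD that comes straight from the direct stream, before any transformation, because transformed RDDs no longer carry that information.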