Re: Determine Topic MetaData Spark Streaming Job

Ashish Soni Mon, 25 Jan 2016 08:54:13 -0800

Correct what i am trying to achieve is that before the streaming job starts
query the topic meta data from kafka , determine all the partition and
provide those to direct API.


So my question is should i consider passing all the partition from command
line and query kafka and find and provide , what is the correct approach.

Ashish

On Mon, Jan 25, 2016 at 11:38 AM, Gerard Maas <gerard.m...@gmail.com> wrote:

> What are you trying to achieve?
>
> Looks like you want to provide offsets but you're not managing them
> and I'm assuming you're using the direct stream approach.
>
> In that case, use the simpler constructor that takes the kafka config and
> the topics. Let it figure it out the offsets (it will contact kafka and
> request the partitions for the topics provided)
>
> KafkaUtils.createDirectStream[...](ssc, kafkaConfig, topics)
>
>  -kr, Gerard
>
> On Mon, Jan 25, 2016 at 5:31 PM, Ashish Soni <asoni.le...@gmail.com>
> wrote:
>
>> Hi All ,
>>
>> What is the best way to tell spark streaming job for the no of partition
>> to to a given topic -
>>
>> Should that be provided as a parameter or command line argument
>> or
>> We should connect to kafka in the driver program and query it
>>
>> Map<TopicAndPartition, Long> fromOffsets = new HashMap<TopicAndPartition,
>> Long>();
>> fromOffsets.put(new TopicAndPartition(driverArgs.inputTopic, 0), 0L);
>>
>> Thanks,
>> Ashish
>>
>
>

Re: Determine Topic MetaData Spark Streaming Job

Reply via email to