Correct what i am trying to achieve is that before the streaming job starts query the topic meta data from kafka , determine all the partition and provide those to direct API.
So my question is should i consider passing all the partition from command line and query kafka and find and provide , what is the correct approach. Ashish On Mon, Jan 25, 2016 at 11:38 AM, Gerard Maas <gerard.m...@gmail.com> wrote: > What are you trying to achieve? > > Looks like you want to provide offsets but you're not managing them > and I'm assuming you're using the direct stream approach. > > In that case, use the simpler constructor that takes the kafka config and > the topics. Let it figure it out the offsets (it will contact kafka and > request the partitions for the topics provided) > > KafkaUtils.createDirectStream[...](ssc, kafkaConfig, topics) > > -kr, Gerard > > On Mon, Jan 25, 2016 at 5:31 PM, Ashish Soni <asoni.le...@gmail.com> > wrote: > >> Hi All , >> >> What is the best way to tell spark streaming job for the no of partition >> to to a given topic - >> >> Should that be provided as a parameter or command line argument >> or >> We should connect to kafka in the driver program and query it >> >> Map<TopicAndPartition, Long> fromOffsets = new HashMap<TopicAndPartition, >> Long>(); >> fromOffsets.put(new TopicAndPartition(driverArgs.inputTopic, 0), 0L); >> >> Thanks, >> Ashish >> > >