Hi,

Thank you for your support! Everything is working, except that I can't figure
out how to pass the video frame (serialized in a pickle file) to Spark.

My problem is that while loading the pickle file stream, I get an EOFError in
the Spark streaming context. I suspect that, because of the large file size,
TCP broke the file into chunks, and on the Spark side each RDD holds only a
single chunk, which is not a valid pickle file.
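
To illustrate the suspicion, here is a minimal standalone sketch (plain
Python, no Kafka involved; the byte string is just a stand-in for a pickled
frame) showing that unpickling a truncated stream fails this way:

    import pickle

    # A stand-in for one pickled video frame.
    data = pickle.dumps(b"\x00" * (10 * 1024 * 1024))

    # Simulate receiving only the first chunk of the stream.
    truncated = data[: len(data) // 2]

    try:
        pickle.loads(truncated)
    except (EOFError, pickle.UnpicklingError) as e:
        # Depending on where the stream is cut, this raises EOFError
        # ("Ran out of input") or "pickle data was truncated".
        print(type(e).__name__, e)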

I'm wondering how to get a complete pickle file from Kafka to Spark.
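
For context, here is roughly what I am trying, as a minimal sketch. It
assumes the kafka-python producer package; the broker address, topic name,
and size limits are placeholders, and the broker-side message.max.bytes
would also need to be raised so a whole frame fits in a single message:

    import pickle
    from kafka import KafkaProducer  # kafka-python package (assumption)

    # Producer side: send each pickled frame as ONE Kafka message.
    # Kafka delivers a message atomically, so the consumer never sees
    # a partial pickle as long as the frame fits in one message.
    frame = b"\x00" * 1024  # stand-in for one encoded video frame
    producer = KafkaProducer(
        bootstrap_servers='<IP_ADDRESS>:9092',
        max_request_size=10485760,  # raise to fit one whole frame
    )
    producer.send('spark-stream-message', value=pickle.dumps(frame))
    producer.flush()

    # Spark side (ssc, topic, brokers as in my streaming job below):
    # decode each message value back into a frame object instead of
    # using the default UTF-8 string decoder.
    kvs = KafkaUtils.createDirectStream(
        ssc, [topic],
        {'metadata.broker.list': brokers,
         'fetch.message.max.bytes': '10485760'},
        valueDecoder=pickle.loads)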

Thank you!


On 25 July 2018 at 14:51, Attila Sasvári <asasv...@apache.org> wrote:

> Hi Biswajit,
>
> Can you please provide more information:
>
> - What other symptoms do you see? Are all your Kafka brokers up and
> running?
> - What replication factor did you set for offsets.topic.replication.factor
> (i.e. the replication factor of __consumer_offsets) in your Kafka broker's
> config? What is min.insync.replicas set to on the __consumer_offsets topic?
> - What does the following command show? kafka-topics.sh --describe
> --zookeeper <ZK_HOST:ZK_PORT>
> - What is the generated consumer group id (which is used to select the
> consumer group coordinator broker) of your pyspark client? Is it different
> from the one used by the Kafka console consumer? How do you commit the
> consumer offsets? I suspect that the consumer offset for the consumer
> group might have already been established in Kafka, and that is why you
> are not able to get records from your pyspark application. Can you try to
> run your application with a consumer group id that did not exist before in
> your cluster? (See the example command below.)
> - What version of Kafka are you using (broker and spark-streaming-kafka)?
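>
> For example, to inspect a group's committed offsets (host, port, and the
> group id are placeholders):
>
>     kafka-consumer-groups.sh --bootstrap-server <BROKER_HOST:PORT> \
>       --describe --group <YOUR_GROUP_ID>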
>
> Regards,
> Attila
>
> On Tue, Jul 24, 2018 at 3:51 PM Biswajit Ghosh <biswaji...@aqbsolutions.com>
> wrote:
>
> > Yes, I have double-checked that.
> >
> > On 24 July 2018 at 19:20, Aman Rastogi <amanr.rast...@gmail.com> wrote:
> >
> > > Is your topic the same in both cases?
> > >
> > > On Tue, 24 Jul 2018, 19:15 Biswajit Ghosh <biswaji...@aqbsolutions.com>
> > > wrote:
> > >
> > > > Hi team,
> > > >
> > > > I got an issue while integrating with the spark streaming using
> > pyspark,
> > > I
> > > > did receive the video stream data in a different consumer subscribe
> to
> > > the
> > > > same topic.
> > > >
> > > > Works fine with this command : *./kafka-console-consumer.sh
> > > > --bootstrap-server <IP_ADDRESS>:9092 --topic spark-streaming-consumer
> > > > --from-beginning*
> > > >
> > > > But not with this:
> > > >
> > > >     from pyspark import SparkContext
> > > >     from pyspark.streaming import StreamingContext
> > > >     from pyspark.streaming.kafka import KafkaUtils
> > > >
> > > >     def processRecord(record):
> > > >         print(record)
> > > >
> > > >     sc = SparkContext(master="local[2]", appName="HNStreaming")
> > > >     sc.setLogLevel('DEBUG')
> > > >     ssc = StreamingContext(sc, 2)
> > > >
> > > >     brokers = "<IP_ADDRESS>:9092"  # same broker as the console consumer
> > > >     topic = "spark-stream-message"
> > > >     kvs = KafkaUtils.createDirectStream(ssc, [topic],
> > > >                                         {'metadata.broker.list': brokers})
> > > >     # processRecord is a module-level function, so no self. prefix
> > > >     kvs.foreachRDD(processRecord)
> > > >
> > > >     ssc.start()
> > > >     ssc.awaitTermination()
> > > > I would appreciate your help as soon as possible.
> > > >
> > > > Thank you!
> > > >
> > > >
> > > > --
> > > >
> > > > Regards,
> > > > biswajitGhosh
> > > >
> > >
> >
> >
> >
> > --
> >
> > Regards,
> > biswajitGhosh
> >
>



-- 

Regards,
biswajitGhosh
