Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Ted Yu
> To: Akhil Das > Cc: Cui Lin , "user@spark.apache.org" < > user@spark.apache.org> > Subject: Re: How to parse Json formatted Kafka message in spark streaming > > Cui: > You can check messages.partitions.size to determine whether messages is > an empty RDD.

Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Cui Lin
From: Ted Yu mailto:yuzhih...@gmail.com>> Date: Thursday, March 5, 2015 at 6:33 AM To: Akhil Das mailto:ak...@sigmoidanalytics.com>> Cc: Cui Lin mailto:cui@hds.com>>, "user@spark.apache.org<mailto:user@spark.apache.org>" mailto:user@spark.apache.org>>

Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Cui Lin
hih...@gmail.com>> Cc: Akhil Das mailto:ak...@sigmoidanalytics.com>>, Cui Lin mailto:cui@hds.com>>, "user@spark.apache.org<mailto:user@spark.apache.org>" mailto:user@spark.apache.org>> Subject: Re: How to parse Json formatted Kafka message in spark st

Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Helena Edelson
Great point :) Cui, Here’s a cleaner way than I had before, w/out the use of spark sql for the mapping: KafkaUtils.createStream[String, String, StringDecoder, StringDecoder]( ssc, kafka.kafkaParams, Map("github" -> 5), StorageLevel.MEMORY_ONLY) .map{ case (k,v) => JsonParser.parse(v).extract[

Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Ted Yu
Cui: You can check messages.partitions.size to determine whether messages is an empty RDD. Cheers On Thu, Mar 5, 2015 at 12:52 AM, Akhil Das wrote: > When you use KafkaUtils.createStream with StringDecoders, it will return > String objects inside your messages stream. To access the elements fro

Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Helena Edelson
Hi Cui, What version of Spark are you using? There was a bug ticket that may be related to this, fixed in core/src/main/scala/org/apache/spark/rdd/RDD.scala that is merged into versions 1.3.0 and 1.2.1 . If you are using 1.1.1 that may be the reason but it’s a stretch https://issues.apache.org/

Re: How to parse Json formatted Kafka message in spark streaming

2015-03-05 Thread Akhil Das
When you use KafkaUtils.createStream with StringDecoders, it will return String objects inside your messages stream. To access the elements from the json, you could do something like the following: val mapStream = messages.map(x=> { val mapper = new ObjectMapper() with ScalaObjectMapper