Something like:

dstream.foreachRDD { rdd =>
  // Each micro-batch arrives here as an RDD[String] of JSON records
  val df = sqlContext.read.json(rdd)
  df.select(…)
}

https://spark.apache.org/docs/latest/streaming-programming-guide.html#output-operations-on-dstreams


Might be the place to start: foreachRDD hands you each batch of the DStream as a
standard RDD, which you can then convert to a DataFrame and query like any other
RDD-backed dataset.

Ewan


From: Daniel Haviv [mailto:daniel.ha...@veracity-group.com]
Sent: 29 September 2015 15:03
To: user <user@spark.apache.org>
Subject: Converting a DStream to schemaRDD

Hi,
I have a DStream which is a stream of RDD[String].

How can I pass a DStream to sqlContext.jsonRDD and work with it as a DataFrame?

Thank you.
Daniel
