Hello,
Is there an easy way to convert RDDs within a DStream into Parquet records?
Here is some incomplete pseudocode:
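import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.kafka.KafkaUtils

// Create the streaming context
val ssc = new StreamingContext(...)
// Obtain a DStream of events from Kafka
val ds = KafkaUtils.createStream(...)
// Get the Spark context from the stream, then build the SQL context
val sc = ds.context.sparkContext
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// For each RDD in the stream
ds.foreachRDD((rdd: RDD[Array[Byte]]) => {
  // What do I do next?
})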
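One possible direction for the body of foreachRDD, as a rough sketch only: turn each batch into a DataFrame and write it out as Parquet. This assumes Spark 1.4 or later (for DataFrame.write), that the stream has already been mapped to raw Array[Byte] payloads, and that each payload is UTF-8 text. The ParquetSink object, the saveAsParquet helper, the one-column "payload" schema, and the batch-<timestamp> directory layout are all hypothetical names and choices made for illustration, not part of any Spark API.

```scala
import java.nio.charset.StandardCharsets

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.streaming.Time
import org.apache.spark.streaming.dstream.DStream

object ParquetSink {
  // Hypothetical schema: a single non-null string column holding the decoded event.
  val eventSchema: StructType =
    StructType(Seq(StructField("payload", StringType, nullable = false)))

  // Hypothetical helper: writes every non-empty batch to its own Parquet directory.
  def saveAsParquet(ds: DStream[Array[Byte]],
                    sqlContext: SQLContext,
                    outputDir: String): Unit = {
    ds.foreachRDD { (rdd: RDD[Array[Byte]], time: Time) =>
      // Skip empty batches so no empty output directories are created.
      if (!rdd.isEmpty()) {
        // Assumption: each event is UTF-8 text; real events would need a real decoder.
        val rows = rdd.map(bytes => Row(new String(bytes, StandardCharsets.UTF_8)))
        sqlContext
          .createDataFrame(rows, eventSchema)
          .write
          .parquet(s"$outputDir/batch-${time.milliseconds}")
      }
    }
  }
}
```

With the pseudocode above, this would be called as ParquetSink.saveAsParquet(ds, sqlContext, outputDir) before ssc.start(), where outputDir is whatever base path the job should write to. Writing one directory per batch avoids overwriting earlier output, at the cost of producing many small files that may need compacting later.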
Thanks,
Mahesh