Hi, I think it would be great if we could do the string parsing only once and then just apply the transformation for each interval (reducing the processing overhead for short intervals).
Also, one issue with the approach above is that transform() has the following signature: def transform(transformFunc: RDD[T] => RDD[U]): DStream[U] and therefore, in my example val result = lines.transform((rdd, time) => { >> // execute statement >> rdd.registerAsTable("data") >> sqlc.sql(query) >> }) >> > the variable `result ` is of type DStream[Row]. That is, the meta-information from the SchemaRDD is lost and, from what I understand, there is then no way to learn about the column names of the returned data, as this information is only encoded in the SchemaRDD. I would love to see a fix for this. Thanks Tobias