RDD and DF are not compatible data types. So you cannot return a DF when you have to return an RDD. What rather you can do is return the underlying RDD of the dataframe by dataframe.rdd().
On Fri, Oct 16, 2015 at 12:07 PM, Jason White <jason.wh...@shopify.com> wrote: > Hi Ken, thanks for replying. > > Unless I'm misunderstanding something, I don't believe that's correct. > Dstream.transform() accepts a single argument, func. func should be a > function that accepts a single RDD, and returns a single RDD. That's what > transform_to_df does, except the RDD it returns is a DF. > > I've used Dstream.transform() successfully in the past when transforming > RDDs, so I don't think my problem is there. > > I haven't tried this in Scala yet, and all of the examples I've seen on the > website seem to use foreach instead of transform. Does this approach work > in > Scala? > > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/PySpark-Streaming-DataFrames-tp25095p25099.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > >