Hi all,

I have a scenario like this:

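val df = dataframe.map(...).filter(...)
// agg 1: global sum over the numeric columns (console sink is only a placeholder)
val query1 = df.groupBy().sum().writeStream.outputMode("complete").format("console").start()
// agg 2: global row count
val query2 = df.groupBy().count().writeStream.outputMode("complete").format("console").start()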

With Spark Streaming (DStreams), we can call persist() on the RDD to reuse the
result of the df computation: when we call persist() after map().filter(), those
operators run only once.
With Structured Streaming (SS), we can't call persist() directly on a streaming
DataFrame, so query1 and query2 do not reuse the result after the filter, and
map/filter run twice. Is there a way to solve this?
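
For reference, here is a minimal sketch of the kind of Spark Streaming (DStream)
job I mean, where persist() lets both aggregations share the map/filter result.
The socket source, host/port, batch interval and object name are hypothetical
placeholders, not my real job:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object PersistSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("persist-sketch")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Hypothetical source: one integer per line on a local socket.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Shared map/filter stage. persist() caches each batch's RDD so the
    // two outputs below reuse it instead of recomputing map/filter.
    val shared = lines.map(_.trim).filter(_.nonEmpty).map(_.toLong).persist()

    // agg 1: per-batch sum
    shared.reduce(_ + _).print()
    // agg 2: per-batch count
    shared.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

What I am looking for is the equivalent of that persist() call for the two
streaming queries above.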

Regards,

Shu li Zheng
