What do you mean by "not distinct"?
It works for me:
Code:
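import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkContext, SparkConf}
// Assumes `sc` is an existing SparkContext (e.g. the one spark-shell provides).
val ssc = new StreamingContext(sc, Seconds(1))
// Each 1-second batch holds the lines of files newly added to this directory.
val data = ssc.textFileStream("/home/akhld/mobi/localcluster/spark-1/sigmoid/")
// transform applies RDD.distinct() to the RDD of each batch.
val dist = data.transform(_.distinct())
dist.print()
ssc.start()
ssc.awaitTermination()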
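One thing that may explain what you are seeing: transform runs the given function on the RDD of each batch separately, so distinct() only removes duplicates within a batch. The same element can still show up again in a later batch. Here is a minimal sketch of that, assuming a local master and two made-up in-memory batches fed through queueStream (the object name and the sample data are hypothetical, not from your job):

```scala
import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object DistinctPerBatch {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, just for illustration.
    val conf = new SparkConf().setMaster("local[2]").setAppName("DistinctPerBatch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Two hypothetical batches; "a" occurs in both of them.
    val batches = mutable.Queue[RDD[String]](
      ssc.sparkContext.parallelize(Seq("a", "a", "b")),
      ssc.sparkContext.parallelize(Seq("a", "c", "c"))
    )

    // oneAtATime = true dequeues one RDD per batch interval.
    val stream = ssc.queueStream(batches, oneAtATime = true)

    // distinct() is applied to each batch RDD on its own.
    stream.transform(_.distinct()).print()

    ssc.start()
    // Run for a few batch intervals, then shut down.
    ssc.awaitTerminationOrTimeout(5000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

With this input I would expect each printed batch to contain no duplicates, but "a" to be printed for both batches. If you need elements to be distinct across batches, that requires keeping state between batches (for example with updateStateByKey), which transform(_.distinct()) alone does not do.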
Thanks
Best Regards
On Fri, Mar 20, 2015 at 11:07 PM, Darren Hoo <[email protected]> wrote:
> val aDstream = ...
>
> val distinctStream = aDstream.transform(_.distinct())
>
> but the elements in distinctStream are not distinct.
>
> Did I use it wrong?
>