Hi,
I am wondering why some stages (like join and filter) are not visible in
the web UI. For example, this code:
val simple = sc.parallelize(Array.range(0, 100))
val simple2 = sc.parallelize(Array.range(0, 100))
val toJoin = simple.map(x => (x, x.toString + x.toString))
val rdd = simple2
  .map(x => (scala.util.Random.nextInt(100), x))
  .join(toJoin)
  .map { case (r, (x, s)) => (r, x) }
  .reduceByKey(_ + _)
  .sortByKey()
  .cache()
rdd.saveAsTextFile("output/1")
val rdd2 = toJoin
  .groupBy { case (x, _) => x }
  .filter { case (x, _) => x < 10 }
rdd2.saveAsTextFile("output/2")
println(rdd2.join(toJoin).count())
does not show the join and filter stages in the UI; moreover, the UI shows
sortByKey and reduceByKey twice.
Could anyone explain how this works?
Thanks,
Grzegorz