Hi Rares,
If you open the details for the two jobs, you will probably see something
like:
Job ID: 1
org.apache.spark.rdd.RDD.takeSample(RDD.scala:447)
$line41.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:22)
...
Job ID: 0
org.apache.spark.rdd.RDD.takeSample(RDD.scala:428)
$line41.$
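The two call sites are at different lines of RDD.scala inside takeSample, which fits with takeSample running two actions. As far as I know, it first calls count() to find out how many elements there are, and then runs sample(...).collect() with a fraction derived from that count. Each action shows up as its own job. Below is a rough spark-shell sketch of that pattern. It is not the actual Spark source: the 2.0 oversampling factor and the final shuffle are my own simplifications, and `sc` is assumed to be the SparkContext the shell provides.

```scala
// Simplified illustration of the count-then-sample pattern (not Spark's real code).
val rdd = sc.textFile("README.md")   // `sc` comes from spark-shell
val num = 3

// Action 1: count the elements so a sampling fraction can be chosen -> first job
val total = rdd.count()

// Assumed oversampling factor of 2.0; Spark computes its own fraction internally.
val fraction = math.min(1.0, 2.0 * num / total)

// Action 2: sample with that fraction and collect to the driver -> second job
val sampled = rdd.sample(false, fraction).collect()

// Trim to the requested size on the driver (no further job needed)
val result = scala.util.Random.shuffle(sampled.toList).take(num)
```

If the first sample comes back with fewer than `num` elements, the real implementation samples again, so in principle you could see more than two jobs.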
Hello,
I am using takeSample from the Scala Spark 1.2.1 shell:
scala> sc.textFile("README.md").takeSample(false, 3)
and I notice that two jobs are generated on the Spark Jobs page:
Job Id  Description
1       takeSample at <console>:13
0       takeSample at <console>:13
Any ideas why the two jobs are needed?
Thanks!
Rar