Re: takeSample triggers 2 jobs

2015-03-06, Denny Lee
Hi Rares,

If you dig into the descriptions for the two jobs, you will probably see something like:

Job ID: 1
org.apache.spark.rdd.RDD.takeSample(RDD.scala:447)
$line41.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:22)
...

Job ID: 0
org.apache.spark.rdd.RDD.takeSample(RDD.scala:428)
$line41.$
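
The two call sites inside takeSample (RDD.scala:428 and RDD.scala:447) are consistent with a two-pass approach: one action to count the RDD so a sampling fraction can be sized, then a second action to collect the sampled elements. A minimal sketch of that pattern follows, meant to be pasted into spark-shell where `sc` is predefined. The oversampling factor and the names `total`, `fraction`, `candidates`, and `picked` are assumptions for illustration, not Spark's actual internals.

```scala
// Paste into spark-shell, where `sc` is the predefined SparkContext.
val rdd = sc.textFile("README.md")
val num = 3

// First action: count() launches a job so the sampling fraction can be sized.
val total = rdd.count()

// Hypothetical oversampling factor of 2; Spark computes its own fraction internally.
val fraction = math.min(1.0, 2.0 * num / total)

// Second action: collect() on the sampled RDD launches another job.
val candidates = rdd.sample(false, fraction).collect()

// Shuffle on the driver and keep at most `num` elements; no job is launched here.
val picked = scala.util.Random.shuffle(candidates.toSeq).take(num)
```

Running this should show two entries on the Spark Jobs page, one for `count` and one for `collect`. Unlike takeSample, this sketch does not retry when the sampled set comes back smaller than `num`, so `picked` may hold fewer than three lines.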

takeSample triggers 2 jobs

2015-03-06, Rares Vernica
Hello,

I am using takeSample from the Scala Spark 1.2.1 shell:

scala> sc.textFile("README.md").takeSample(false, 3)

and I notice that two jobs are generated on the Spark Jobs page:

Job Id  Description
1       takeSample at <console>:13
0       takeSample at <console>:13

Any ideas why the two jobs are needed?

Thanks!
Rar