Hi Oleg,

Did you ever figure this out? I'm seeing the same exception in 0.9.1, and I
think it may be related to setting spark.speculation=true. My theory is that
multiple attempts at the same task start, the first finishes and cleans up
the _temporary directory, and the second then fails because _temporary no
longer exists.

Thanks!
Andrew
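If that theory is right, turning speculation off should avoid the race as a workaround. A minimal sketch of how I'd do that (the app name is just a placeholder; this assumes the standard SparkConf API in 0.9.x):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Workaround sketch: disable speculative execution so a second task
// attempt can't race the first one and find _temporary already deleted.
val conf = new SparkConf()
  .setAppName("save-without-speculation") // placeholder app name
  .set("spark.speculation", "false")
val sc = new SparkContext(conf)
```

The same property can of course be set in spark-env / on the command line instead of in code.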


On Mon, Jun 9, 2014 at 1:19 PM, Oleg Proudnikov <oleg.proudni...@gmail.com>
wrote:

> Hi All,
>
> After a few simple transformations I am trying to save to a local file
> system. The code works in local mode but not on a standalone cluster. The
> directory *10000.txt/_temporary* does exist after the exception.
>
> I would appreciate any suggestions.
>
>
> *scala> d3.sample(false,0.01,1).map( pair => pair._2
> ).saveAsTextFile("10000.txt")*
>
>
> 14/06/09 22:06:40 ERROR TaskSetManager: Task 0.0:0 failed 4 times;
> aborting job
> *org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed 4 times
> (most recent failure: Exception failure: java.io.IOException: The temporary
> job-output directory
> file:/data/spark-0.9.1-bin-hadoop1/10000.txt/_temporary doesn't exist!)*
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
>  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
>  at scala.Option.foreach(Option.scala:236)
>  at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
>  at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>  at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>  at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
>
> Thank you,
> Oleg
>
>
