Do you see this error right at the beginning, or only after the job has been running for some time?
The root cause seems to be that your Spark executors somehow got killed, which killed the receivers and caused the further errors. Please take a look at the logs of the lost executor (executor 5 in the log below) to find out what caused it to fail.

TD

On Thu, Aug 28, 2014 at 3:54 PM, Tim Smith <secs...@gmail.com> wrote:

> Hi,
>
> Have a Spark-1.0.0 (CDH5) streaming job reading from kafka that died with:
>
> 14/08/28 22:28:15 INFO DAGScheduler: Failed to run runJob at
> ReceiverTracker.scala:275
> Exception in thread "Thread-59" 14/08/28 22:28:15 INFO
> YarnClientClusterScheduler: Cancelling stage 2
> 14/08/28 22:28:15 INFO DAGScheduler: Executor lost: 5 (epoch 4)
> 14/08/28 22:28:15 INFO BlockManagerMasterActor: Trying to remove executor
> 5 from BlockManagerMaster.
> 14/08/28 22:28:15 INFO BlockManagerMaster: Removed 5 successfully in
> removeExecutor
> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> 2.0:0 failed 4 times, most recent failure: TID 6481 on host
> node-dn1-1.ops.sfdc.net failed for unknown reason
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at
> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> at
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> at
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
>
> Any insights into this error?
>
> Thanks,
>
> Tim
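
P.S. To work out which executor's logs to pull, you can scan the driver log for the "Executor lost" lines. Here is a rough Python sketch. The regex is only modeled on the single DAGScheduler line in the quoted log, so treat the pattern as an assumption, and the helper name lost_executors is made up for illustration. Since the job runs on YARN, something like `yarn logs -applicationId <application id>` should then fetch the container logs, assuming log aggregation is enabled on the cluster.

```python
import re

# Modeled on the driver log line quoted above, e.g.
# "14/08/28 22:28:15 INFO DAGScheduler: Executor lost: 5 (epoch 4)".
# Assumption: other Spark versions may format this line differently.
LOST_EXECUTOR = re.compile(r"DAGScheduler: Executor lost: (\S+) \(epoch (\d+)\)")


def lost_executors(log_text):
    """Return (executor_id, epoch) pairs in the order they appear in the log."""
    return [(m.group(1), int(m.group(2))) for m in LOST_EXECUTOR.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "14/08/28 22:28:15 INFO YarnClientClusterScheduler: Cancelling stage 2\n"
        "14/08/28 22:28:15 INFO DAGScheduler: Executor lost: 5 (epoch 4)\n"
    )
    for executor_id, epoch in lost_executors(sample):
        print("lost executor", executor_id, "at epoch", epoch)
```

The executor ID is kept as a string because it is only used to locate the matching container log, not for arithmetic.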