What could cause this type of "stage failure"? Thanks!

This is a simple PySpark script that lists data in HBase.

Command line:
./spark-submit --driver-class-path ~/spark-examples-1.1.0-hadoop2.3.0.jar /root/workspace/test/sparkhbase.py

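The script itself is not included in this post. For context, here is a minimal sketch of what a script of this kind usually looks like, modeled on the hbase_inputformat.py example shipped with Spark 1.1 and on the newAPIHadoopRDD call visible in the traceback. The helper names build_hbase_conf and read_table, and the host and table values, are assumptions made for illustration only; they are not taken from sparkhbase.py.

```python
# Hypothetical sketch, NOT the poster's actual sparkhbase.py.
# Class names follow Spark's bundled hbase_inputformat.py example; the
# converters live in the spark-examples jar passed via --driver-class-path.
INPUT_FORMAT = "org.apache.hadoop.hbase.mapreduce.TableInputFormat"
KEY_CLASS = "org.apache.hadoop.hbase.io.ImmutableBytesWritable"
VALUE_CLASS = "org.apache.hadoop.hbase.client.Result"
KEY_CONVERTER = ("org.apache.spark.examples.pythonconverters."
                 "ImmutableBytesWritableToStringConverter")
VALUE_CONVERTER = ("org.apache.spark.examples.pythonconverters."
                   "HBaseResultToStringConverter")


def build_hbase_conf(zk_quorum, table):
    """Return the Hadoop configuration dict that TableInputFormat reads."""
    return {
        "hbase.zookeeper.quorum": zk_quorum,
        "hbase.mapreduce.inputtable": table,
    }


def read_table(sc, zk_quorum, table):
    """Build an RDD of (row key, row contents) strings from an HBase table.

    `sc` is an existing pyspark SparkContext.
    """
    return sc.newAPIHadoopRDD(
        INPUT_FORMAT,
        KEY_CLASS,
        VALUE_CLASS,
        keyConverter=KEY_CONVERTER,
        valueConverter=VALUE_CONVERTER,
        conf=build_hbase_conf(zk_quorum, table),
    )
```

On a live cluster this would be driven by something like sc = SparkContext(appName="HBaseList") followed by read_table(sc, "zk-host", "my_table").collect(), where "zk-host" and "my_table" are placeholders.
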
14/10/21 17:53:50 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-***.ec2.internal:35201 (size: 1470.0 B, free: 265.4 MB)
14/10/21 17:53:50 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
14/10/21 17:53:50 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (MappedRDD[1] at map at PythonHadoopUtil.scala:185)
14/10/21 17:53:50 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/10/21 17:53:53 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@ip-***.ec2.internal:34050/user/Executor#681287499] with ID 0
14/10/21 17:53:53 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-****.internal, ANY, 1264 bytes)
14/10/21 17:53:53 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@ip-***.ec2.internal:47483/user/Executor#-936252397] with ID 1
14/10/21 17:53:53 INFO BlockManagerMasterActor: Registering block manager ip-****2.internal:49236 with 3.1 GB RAM
14/10/21 17:53:54 INFO BlockManagerMasterActor: Registering block manager ip-.ec2.internal:36699 with 3.1 GB RAM
14/10/21 17:53:54 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ip-.ec2.internal): java.lang.IllegalStateException: unread block data
    java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)
14/10/21 17:53:54 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1, ip-.internal, ANY, 1264 bytes)
14/10/21 17:53:54 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) on executor ip-.internal: java.lang.IllegalStateException (unread block data) [duplicate 1]
14/10/21 17:53:54 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 2, ip-.internal, ANY, 1264 bytes)
14/10/21 17:53:54 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 2) on executor ip-.internal: java.lang.IllegalStateException (unread block data) [duplicate 2]
14/10/21 17:53:54 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 3, ip-.internal, ANY, 1264 bytes)
14/10/21 17:53:54 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 3) on executor ip-2.internal: java.lang.IllegalStateException (unread block data) [duplicate 3]
14/10/21 17:53:54 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
14/10/21 17:53:54 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/10/21 17:53:54 INFO TaskSchedulerImpl: Cancelling stage 0
14/10/21 17:53:54 INFO DAGScheduler: Failed to run first at SerDeUtil.scala:70
Traceback (most recent call last):
  File "/root/workspace/test/sparkhbase.py", line 17, in <module>
    conf=conf2)
  File "/root/spark/python/pyspark/context.py", line 471, in newAPIHadoopRDD
    jconf, batchSize)
  File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, ip-internal): java.lang.IllegalStateException: unread block data
    java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/stage-failure-Task-0-in-stage-0-0-failed-4-times-tp16928.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.