Thanks Xiangrui. After some debugging efforts, it turns out that the
problem results from a bug in my code. But it's good to know that a long
lineage could also lead to this problem. I will also try checkpointing to
see whether the performance can be improved.

Best regards,
- Guanhua

On 5/13/14 12:10 AM, "Xiangrui Meng" <men...@gmail.com> wrote:

>You have a long lineage that causes the StackOverflow error. Try
>rdd.checkPoint() and rdd.count() for every 20~30 iterations.
>checkPoint can cut the lineage. -Xiangrui
>
>On Mon, May 12, 2014 at 3:42 PM, Guanhua Yan <gh...@lanl.gov> wrote:
>> Dear Sparkers:
>>
>> I am using Python spark of version 0.9.0 to implement some iterative
>> algorithm. I got some errors shown at the end of this email. It seems
>>that
>> it's due to the Java Stack Overflow error. The same error has been
>> duplicated on a mac desktop and a linux workstation, both running the
>>same
>> version of Spark.
>>
>> The same line of code works correctly after quite some iterations. At
>>the
>> line of error, rdd__new.count() could be 0. (In some previous rounds,
>>this
>> was also 0 without any problem).
>>
>> Any thoughts on this?
>>
>> Thank you very much,
>> - Guanhua
>>
>>
>> ========================================
>> CODE:    print "round", round, rdd__new.count()
>> ========================================
>>   File
>> 
>>"/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/
>>rdd.py",
>> line 542, in count
>> 14/05/12 16:20:28 INFO TaskSetManager: Loss was due to
>> java.lang.StackOverflowError [duplicate 1]
>>     return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
>> 14/05/12 16:20:28 ERROR TaskSetManager: Task 8419.0:0 failed 1 times;
>> aborting job
>>   File
>> 
>>"/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/
>>rdd.py",
>> line 533, in sum
>> 14/05/12 16:20:28 INFO TaskSchedulerImpl: Ignoring update with state
>>FAILED
>> from TID 1774 because its task set is gone
>>     return self.mapPartitions(lambda x: [sum(x)]).reduce(operator.add)
>>   File
>> 
>>"/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/
>>rdd.py",
>> line 499, in reduce
>>     vals = self.mapPartitions(func).collect()
>>   File
>> 
>>"/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/
>>rdd.py",
>> line 463, in collect
>>     bytesInJava = self._jrdd.collect().iterator()
>>   File
>> 
>>"/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/lib/py4j
>>-0.8.1-src.zip/py4j/java_gateway.py",
>> line 537, in __call__
>>   File
>> 
>>"/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/lib/py4j
>>-0.8.1-src.zip/py4j/protocol.py",
>> line 300, in get_return_value
>> py4j.protocol.Py4JJavaError: An error occurred while calling
>>o4317.collect.
>> : org.apache.spark.SparkException: Job aborted: Task 8419.0:1 failed 1
>>times
>> (most recent failure: Exception failure: java.lang.StackOverflowError)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$schedul
>>er$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$schedul
>>er$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
>> at
>> 
>>scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scal
>>a:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGSch
>>eduler$$abortStage(DAGScheduler.scala:1026)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DA
>>GScheduler.scala:619)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DA
>>GScheduler.scala:619)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:6
>>19)
>> at
>> 
>>org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun
>>$receive$1.applyOrElse(DAGScheduler.scala:207)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> at
>> 
>>akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(Abstract
>>Dispatcher.scala:386)
>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at
>> 
>>scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.jav
>>a:1339)
>> at 
>>scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at
>> 
>>scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.j
>>ava:107)
>>
>> ======================================
>> The stack overflow error is shown as follows:
>> ======================================
>>
>> 14/05/12 16:20:28 ERROR Executor: Exception in task ID 1774
>> java.lang.StackOverflowError
>> at java.util.zip.Inflater.inflate(Inflater.java:259)
>> at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
>> at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
>> at
>> 
>>java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:231
>>0)
>> at
>> 
>>java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.jav
>>a:2323)
>> at
>> 
>>java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.
>>java:2818)
>> at java.io.ObjectInputStream.readHandle(ObjectInputStream.java:1452)
>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1511)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>> at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> at
>> 
>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorI
>>mpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at 
>>java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>> at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> at
>> 
>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorI
>>mpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at 
>>java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at 
>>java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>> at 
>>java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>> at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>>  The above replicated many times after this Š
>> ======================================


Reply via email to