Your executors are running out of memory, and the tasks subsequently
scheduled on them are failing as well, hence the lost TIDs (task IDs).
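
If you're on PySpark, two things usually help: give each executor more
heap, and cut down the number of partitions before the shuffle. A minimal
sketch, assuming a standalone setup; the memory size, input path, and
partition count are placeholders you'd tune for your cluster:

    from pyspark import SparkConf, SparkContext

    # More heap per executor so cached blocks and shuffle buffers fit.
    # "4g" is a placeholder; use what your machine can actually spare.
    conf = (SparkConf()
            .setAppName("term-weighting")
            .set("spark.executor.memory", "4g"))
    sc = SparkContext(conf=conf)

    # Collapsing thousands of tiny partitions before the shuffle cuts
    # per-task overhead; coalesce() avoids a full reshuffle.
    docs = sc.textFile("docs/*.txt").coalesce(64)  # 64 is a guess

You can also persist the RDD with StorageLevel.MEMORY_AND_DISK so that
partitions which don't fit in memory spill to disk instead of killing
the executor.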


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Mon, Jun 30, 2014 at 7:47 PM, Sguj <tpcome...@yahoo.com> wrote:

> I'm trying to perform operations on a large RDD that ends up being about
> 1.3 GB in memory once loaded. It's cached in memory during the first
> operation, but when a later task uses the RDD, I get this error saying
> the RDD was lost:
>
> 14/06/30 09:48:17 INFO TaskSetManager: Serialized task 1.0:4 as 8245 bytes in 0 ms
> 14/06/30 09:48:17 WARN TaskSetManager: Lost TID 15611 (task 1.0:3)
> 14/06/30 09:48:17 WARN TaskSetManager: Loss was due to org.apache.spark.api.python.PythonException
> org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>   File "/Users/me/Desktop/spark-1.0.0/python/pyspark/worker.py", line 73, in main
>     command = pickleSer._read_with_length(infile)
>   File "/Users/me/Desktop/spark-1.0.0/python/pyspark/serializers.py", line 142, in _read_with_length
>     length = read_int(stream)
>   File "/Users/me/Desktop/spark-1.0.0/python/pyspark/serializers.py", line 337, in read_int
>     raise EOFError
> EOFError
>
>         at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:115)
>         at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:145)
>         at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:78)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>         at org.apache.spark.scheduler.Task.run(Task.scala:51)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 14/06/30 09:48:18 INFO AppClient$ClientActor: Executor updated: app-20140630090515-0000/0 is now FAILED (Command exited with code 52)
> 14/06/30 09:48:18 INFO SparkDeploySchedulerBackend: Executor app-20140630090515-0000/0 removed: Command exited with code 52
> 14/06/30 09:48:18 INFO SparkDeploySchedulerBackend: Executor 0 disconnected, so removing it
> 14/06/30 09:48:18 ERROR TaskSchedulerImpl: Lost executor 0 on localhost: OutOfMemoryError
> 14/06/30 09:48:18 INFO TaskSetManager: Re-queueing tasks for 0 from TaskSet 1.0
> 14/06/30 09:48:18 WARN TaskSetManager: Lost TID 15610 (task 1.0:2)
> 14/06/30 09:48:18 WARN TaskSetManager: Lost TID 15609 (task 1.0:1)
> 14/06/30 09:48:18 WARN TaskSetManager: Lost TID 15612 (task 1.0:4)
> 14/06/30 09:48:18 WARN TaskSetManager: Lost TID 15608 (task 1.0:0)
>
>
> The operation it fails on is a reduceByKey(), and the RDD before that
> operation is split into several thousand partitions (I'm doing term
> weighting, which initially requires a separate partition for each
> document). The system has 6 GB of memory for the executor, so I'm not
> sure whether it's actually a memory error, as the OutOfMemoryError near
> the end of the log suggests. The serializer portion of the error is what
> really confuses me, and I can't find any references to this particular
> error in connection with Spark.
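>
> For concreteness, the job is shaped roughly like this (simplified; the
> path and the tokenization are stand-ins for the real code):
>
>     from pyspark import SparkContext
>
>     sc = SparkContext(appName="term-weighting")
>     # One small input file per document, so textFile() starts out
>     # with roughly one partition per document.
>     docs = sc.textFile("docs/*.txt")
>     pairs = docs.flatMap(lambda line: line.split()) \
>                 .map(lambda term: (term, 1))
>     # This reduceByKey() is the step that fails.
>     counts = pairs.reduceByKey(lambda a, b: a + b)
>     print(counts.take(10))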
>
> Does anyone have a clue as to what the actual error might be here, and what
> a possible solution would be?
>
