Thanks Zoltan. So far I've got a full repro that works both in Docker and on a bigger real-world cluster. Also, the whole thing only happens in `cluster` mode. I filed a ticket for it. Any thoughts?

https://issues.apache.org/jira/browse/SPARK-10487
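While SPARK-10487 is investigated, a hedged workaround sketch: the OOM only shows up in cluster mode, where the driver runs inside the YARN AM container and the default containers are small (the YarnAllocator lines in the original report below show ~1.4 GB requests), so a larger driver heap may be worth a try. --driver-memory is a standard spark-submit option; the 2g figure is an illustrative guess, not a confirmed fix:

    # In yarn-cluster mode the driver lives in the YARN AM container,
    # so --driver-memory sizes that container (plus memory overhead).
    spark-submit --master yarn --deploy-mode cluster \
      --driver-memory 2g \
      spark-ml.py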
On Mon, Sep 7, 2015 at 7:59 PM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:

> Hi,
>
> I ran your example on Spark-1.4.1 and 1.5.0-rc3. It succeeds on 1.4.1 but
> throws the OOM on 1.5.0. Do any of you know which PR introduced this issue?
>
> Zsolt
>
> 2015-09-07 16:33 GMT+02:00 Zoltán Zvara <zoltan.zv...@gmail.com>:
>
>> Hey, I'd try to debug and profile ResolvedDataSource. As far as I know,
>> your write will be performed by the JVM.
>>
>> On Mon, Sep 7, 2015 at 4:11 PM Tóth Zoltán <t...@looper.hu> wrote:
>>
>>> Unfortunately I'm getting the same error.
>>>
>>> The other interesting things are:
>>> - the parquet files actually got written to HDFS (also with
>>>   .write.parquet() )
>>> - the application gets stuck in the RUNNING state for good, even after
>>>   the error is thrown
>>>
>>> 15/09/07 10:01:10 INFO spark.ContextCleaner: Cleaned accumulator 19
>>> 15/09/07 10:01:10 INFO spark.ContextCleaner: Cleaned accumulator 5
>>> 15/09/07 10:01:12 INFO spark.ContextCleaner: Cleaned accumulator 20
>>> Exception in thread "Thread-7"
>>> Exception: java.lang.OutOfMemoryError thrown from the
>>> UncaughtExceptionHandler in thread "Thread-7"
>>> Exception in thread "org.apache.hadoop.hdfs.PeerCache@4070d501"
>>> Exception: java.lang.OutOfMemoryError thrown from the
>>> UncaughtExceptionHandler in thread "org.apache.hadoop.hdfs.PeerCache@4070d501"
>>> Exception in thread "LeaseRenewer:r...@docker.rapidminer.com:8020"
>>> Exception: java.lang.OutOfMemoryError thrown from the
>>> UncaughtExceptionHandler in thread "LeaseRenewer:r...@docker.rapidminer.com:8020"
>>> Exception in thread "Reporter"
>>> Exception: java.lang.OutOfMemoryError thrown from the
>>> UncaughtExceptionHandler in thread "Reporter"
>>> Exception in thread "qtp2134582502-46"
>>> Exception: java.lang.OutOfMemoryError thrown from the
>>> UncaughtExceptionHandler in thread "qtp2134582502-46"
>>>
>>> On Mon, Sep 7, 2015 at 3:48 PM, boci <boci.b...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Can you try using the save method instead of write?
>>>>
>>>> ex: out_df.save("path","parquet")
>>>>
>>>> b0c1
>>>> --------------------------------------------------
>>>> Skype: boci13, Hangout: boci.b...@gmail.com
>>>>
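For reference, a minimal sketch of the two call styles side by side. df.save() is the older generic entry point (deprecated around 1.4 in favor of df.write, if I remember right); the output path below is illustrative:

    # Older generic save; the second argument names the data source format.
    out_df.save("/tmp/logparquet", "parquet")

    # DataFrameWriter API; this is the call that hits the OOM reported below.
    out_df.write.parquet("/tmp/logparquet")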
>>>> On Mon, Sep 7, 2015 at 3:35 PM, Zoltán Tóth <zoltanct...@gmail.com> wrote:
>>>>
>>>>> Aaand, the error! :)
>>>>>
>>>>> Exception in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf"
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf"
>>>>> Exception in thread "Thread-7"
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "Thread-7"
>>>>> Exception in thread "LeaseRenewer:r...@docker.rapidminer.com:8020"
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "LeaseRenewer:r...@docker.rapidminer.com:8020"
>>>>> Exception in thread "Reporter"
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "Reporter"
>>>>> Exception in thread "qtp2115718813-47"
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "qtp2115718813-47"
>>>>>
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "sparkDriver-scheduler-1"
>>>>>
>>>>> Log Type: stdout
>>>>> Log Upload Time: Mon Sep 07 09:03:01 -0400 2015
>>>>> Log Length: 986
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "spark-ml.py", line 33, in <module>
>>>>>     out_df.write.parquet("/tmp/logparquet")
>>>>>   File "/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/root/appcache/application_1441224592929_0022/container_1441224592929_0022_01_000001/pyspark.zip/pyspark/sql/readwriter.py", line 422, in parquet
>>>>>   File "/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/root/appcache/application_1441224592929_0022/container_1441224592929_0022_01_000001/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
>>>>>   File "/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/root/appcache/application_1441224592929_0022/container_1441224592929_0022_01_000001/pyspark.zip/pyspark/sql/utils.py", line 36, in deco
>>>>>   File "/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/root/appcache/application_1441224592929_0022/container_1441224592929_0022_01_000001/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
>>>>> py4j.protocol.Py4JJavaError
>>>>>
>>>>> On Mon, Sep 7, 2015 at 3:27 PM, Zoltán Tóth <zoltanct...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> When I execute the Spark ML Logistic Regression example in pyspark, I
>>>>>> run into an OutOfMemory exception. I'm wondering if any of you have
>>>>>> experienced the same, or have a hint about how to fix it.
>>>>>>
>>>>>> The interesting bit is that I only get the exception when I try to
>>>>>> write the result DataFrame into a file. If I only "print" any of the
>>>>>> results, it all works fine.
>>>>>>
>>>>>> My Setup:
>>>>>> Spark 1.5.0-SNAPSHOT built for Hadoop 2.6.0 (I'm working with the
>>>>>> latest nightly build)
>>>>>> Build flags: -Psparkr -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn
>>>>>> -DzincPort=3034
>>>>>>
>>>>>> I'm using the default resource setup:
>>>>>> 15/09/07 08:49:04 INFO yarn.YarnAllocator: Will request 2 executor
>>>>>> containers, each with 1 cores and 1408 MB memory including 384 MB overhead
>>>>>> 15/09/07 08:49:04 INFO yarn.YarnAllocator: Container request (host:
>>>>>> Any, capability: <memory:1408, vCores:1>)
>>>>>> 15/09/07 08:49:04 INFO yarn.YarnAllocator: Container request (host:
>>>>>> Any, capability: <memory:1408, vCores:1>)
>>>>>>
>>>>>> The script I'm executing:
>>>>>>
>>>>>> from pyspark import SparkContext, SparkConf
>>>>>> from pyspark.sql import SQLContext
>>>>>>
>>>>>> conf = SparkConf().setAppName("pysparktest")
>>>>>> sc = SparkContext(conf=conf)
>>>>>> sqlContext = SQLContext(sc)
>>>>>>
>>>>>> from pyspark.mllib.regression import LabeledPoint
>>>>>> from pyspark.mllib.linalg import Vector, Vectors
>>>>>>
>>>>>> # Toy training set of labeled dense vectors.
>>>>>> training = sc.parallelize((
>>>>>>     LabeledPoint(1.0, Vectors.dense(0.0, 1.1, 0.1)),
>>>>>>     LabeledPoint(0.0, Vectors.dense(2.0, 1.0, -1.0)),
>>>>>>     LabeledPoint(0.0, Vectors.dense(2.0, 1.3, 1.0)),
>>>>>>     LabeledPoint(1.0, Vectors.dense(0.0, 1.2, -0.5))))
>>>>>>
>>>>>> training_df = training.toDF()
>>>>>>
>>>>>> from pyspark.ml.classification import LogisticRegression
>>>>>>
>>>>>> reg = LogisticRegression()
>>>>>> reg.setMaxIter(10).setRegParam(0.01)
>>>>>> model = reg.fit(training_df)
>>>>>>
>>>>>> test = sc.parallelize((
>>>>>>     LabeledPoint(1.0, Vectors.dense(-1.0, 1.5, 1.3)),
>>>>>>     LabeledPoint(0.0, Vectors.dense(3.0, 2.0, -0.1)),
>>>>>>     LabeledPoint(1.0, Vectors.dense(0.0, 2.2, -1.5))))
>>>>>>
>>>>>> # Scoring itself is fine; it is the write below that triggers the OOM.
>>>>>> out_df = model.transform(test.toDF())
>>>>>> out_df.write.parquet("/tmp/logparquet")
>>>>>>
>>>>>> And the command:
>>>>>> spark-submit --master yarn --deploy-mode cluster spark-ml.py
>>>>>>
>>>>>> Thanks,
>>>>>> z
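For concreteness, the contrast the report describes, as a minimal sketch. The exact "print" calls aren't shown in the thread; show() and count() are representative stand-ins, with out_df as defined in the script above:

    out_df.show()                              # inspecting results works fine
    print(out_df.count())                      # forcing the computation also works
    out_df.write.parquet("/tmp/logparquet")    # only this write OOMs, and only in cluster mode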