Hi Daniel,

I was indeed able to find the problem in my case. It turned out to be a bug on the Parquet side, and I raised and contributed to the following issue:
https://issues.apache.org/jira/browse/PARQUET-353

Hope this helps!

Thanks
-Nitin

On Mon, May 2, 2016 at 9:15 PM, Daniel Darabos
<daniel.dara...@lynxanalytics.com> wrote:

> Hi Nitin,
> Sorry for waking up this ancient thread. That's a fantastic set of JVM
> flags! We just hit the same problem, but we hadn't even discovered all
> those flags for limiting memory growth. I wanted to ask if you ever
> discovered anything further?
>
> I see you also set -XX:NewRatio=3. This is a very important flag since
> Spark 1.6.0. With unified memory management and the default
> spark.memory.fraction=0.75, the cache will fill up 75% of the heap. The
> default NewRatio is 2, which leaves the old generation pool only 2/3 of
> the heap, so the cache does not fit in it and full GCs are triggered
> constantly. With NewRatio=3 the old generation pool is 75% of the heap,
> so it (just) fits the cache. We find this makes a very significant
> performance difference in practice.
>
> Perhaps this should be documented somewhere. Or the default
> spark.memory.fraction should be 0.66, so that it works out with the
> default JVM flags.
>
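A minimal sketch of the tuning Daniel describes above, assuming Spark 1.6+
with unified memory management. The configuration keys are standard Spark
properties, but the values, the "tuned" name, and the choice of setting the
executor JVM options through SparkConf (rather than spark-defaults.conf or
spark-submit) are illustrative, not taken from this thread:

    import org.apache.spark.SparkConf

    // Option 1 (what Daniel describes): keep spark.memory.fraction at its
    // 0.75 default and make the old generation large enough for the cache.
    // With -XX:NewRatio=3 the old generation is NewRatio / (NewRatio + 1)
    // = 3/4 = 75% of the heap.
    val tuned = new SparkConf()
      .set("spark.memory.fraction", "0.75")
      .set("spark.executor.extraJavaOptions", "-XX:NewRatio=3 -XX:+UseG1GC")

    // Option 2 (the alternative he mentions): keep the default JVM flags
    // (NewRatio=2, old generation = 2/3 of the heap) and shrink the cache
    // so it fits:
    // val tuned = new SparkConf().set("spark.memory.fraction", "0.66")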
> On Mon, Jul 27, 2015 at 6:08 PM, Nitin Goyal <nitin2go...@gmail.com>
> wrote:
>
>> I am running a Spark application in YARN having 2 executors with
>> Xms/Xmx as 32 GB and spark.yarn.executor.memoryOverhead as 6 GB.
>>
>> I am seeing that the app's physical memory is ever increasing, and the
>> app finally gets killed by the node manager:
>>
>> 2015-07-25 15:07:05,354 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
>> Container [pid=10508,containerID=container_1437828324746_0002_01_000003]
>> is running beyond physical memory limits. Current usage: 38.0 GB of 38 GB
>> physical memory used; 39.5 GB of 152 GB virtual memory used. Killing
>> container.
>> Dump of the process-tree for container_1437828324746_0002_01_000003 :
>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>> |- 10508 9563 10508 10508 (bash) 0 0 9433088 314 /bin/bash -c
>> /usr/java/default/bin/java -server -XX:OnOutOfMemoryError='kill %p'
>> -Xms32768m -Xmx32768m -Dlog4j.configuration=log4j-executor.properties
>> -XX:MetaspaceSize=512m -XX:+UseG1GC -XX:+PrintGCTimeStamps
>> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:gc.log
>> -XX:AdaptiveSizePolicyOutputInterval=1 -XX:+UseGCLogFileRotation
>> -XX:GCLogFileSize=500M -XX:NumberOfGCLogFiles=1
>> -XX:MaxDirectMemorySize=3500M -XX:NewRatio=3
>> -Dcom.sun.management.jmxremote
>> -Dcom.sun.management.jmxremote.port=36082
>> -Dcom.sun.management.jmxremote.authenticate=false
>> -Dcom.sun.management.jmxremote.ssl=false -XX:NativeMemoryTracking=detail
>> -XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=512m
>> -XX:CompressedClassSpaceSize=256m
>> -Djava.io.tmpdir=/data/yarn/datanode/nm-local-dir/usercache/admin/appcache/application_1437828324746_0002/container_1437828324746_0002_01_000003/tmp
>> '-Dspark.driver.port=43354'
>> -Dspark.yarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1437828324746_0002/container_1437828324746_0002_01_000003
>> org.apache.spark.executor.CoarseGrainedExecutorBackend
>> akka.tcp://sparkDriver@nn1:43354/user/CoarseGrainedScheduler 1 dn3 6
>> application_1437828324746_0002 1>
>> /opt/hadoop/logs/userlogs/application_1437828324746_0002/container_1437828324746_0002_01_000003/stdout
>> 2>
>> /opt/hadoop/logs/userlogs/application_1437828324746_0002/container_1437828324746_0002_01_000003/stderr
>>
>> I disabled YARN's parameter "yarn.nodemanager.pmem-check-enabled" and
>> noticed that physical memory usage went up to 40 GB.
>>
>> I checked the total RSS in /proc/pid/smaps, and it was the same value
>> as the physical memory reported by YARN and seen in the top command.
>>
>> I verified that it is not a problem with the heap; something is growing
>> in off-heap/native memory. I used tools like VisualVM but didn't find
>> anything increasing there. MaxDirectMemory also didn't exceed 600 MB,
>> the peak number of active threads was 70-80, thread stack size didn't
>> exceed 100 MB, and Metaspace usage was around 60-70 MB.
>>
>> FYI, I am on Spark 1.2 and Hadoop 2.4.0, and my Spark application is
>> based on Spark SQL. It is an HDFS read/write-intensive application and
>> caches data in Spark SQL's in-memory caching.
>>
>> Any help would be highly appreciated, as would any hint on where I
>> should look to debug the memory leak, or whether a suitable tool
>> already exists. Let me know if any other information is needed.
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Ever-increasing-physical-memory-for-a-Spark-Application-in-YARN-tp13446.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>

--
Regards
Nitin Goyal
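Nitin's observation above that MaxDirectMemory stayed under 600 MB can be
checked from inside the executor JVM through the platform MXBeans. The
sketch below is a generic JMX probe, not something from this thread; the
DirectMemoryProbe name is made up for illustration:

    import java.lang.management.{BufferPoolMXBean, ManagementFactory}
    import scala.collection.JavaConverters._

    // Print the JVM's "direct" and "mapped" buffer pool usage, one way to
    // confirm whether NIO direct memory is the part of the process that
    // is actually growing.
    object DirectMemoryProbe {
      def main(args: Array[String]): Unit = {
        val pools = ManagementFactory
          .getPlatformMXBeans(classOf[BufferPoolMXBean])
          .asScala
        pools.foreach { p =>
          println(s"${p.getName}: used=${p.getMemoryUsed / (1024 * 1024)} MB, " +
            s"capacity=${p.getTotalCapacity / (1024 * 1024)} MB, buffers=${p.getCount}")
        }
      }
    }

Since the executor command line above already passes
-XX:NativeMemoryTracking=detail, running "jcmd <pid> VM.native_memory
summary" against the executor process is another way to break down the
non-heap usage; RSS beyond what NMT accounts for generally points to native
allocations the JVM does not track, consistent with Nitin's follow-up at the
top of the thread identifying PARQUET-353 as the eventual culprit.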