>>> This seems to be a general issue in the JVM with very large heaps.
>>> I agree that the best workaround would be to keep the heap size below 32GB.
>>> Thanks guys!
>>>
>>> Mingyu
>>>
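The 32GB figure is presumably the point at which HotSpot can no longer use compressed ordinary object pointers; the thread does not say so explicitly. As an illustration only, the threshold can be checked on a HotSpot JVM with:

java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # true: compressed oops still on
java -Xmx32g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # false: compressed oops disabled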
From: Arun Ahuja
Date: Monday, October 6, 2014 at 7:50 AM
To: Andrew Ash
Cc: Mingyu Kim, "user@spark.apache.org" <user@spark.apache.org>, Dennis Lawler
Subject: Re: Larger heap leads to perf degradation due to GC

We have used the strategy that you suggested, Andrew - using many workers
per machine and keeping the heaps small (< 20gb).
Using a large heap resulted in workers hanging or not responding (leading
to timeouts). The same dataset/job for us will fail (most often due to
akka disassociated or fetch failures) ...
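Not described in this thread, but a common first step for confirming that such hangs are GC pauses is to enable GC logging on the executors, for example in conf/spark-defaults.conf:

# Illustrative only: print GC activity in each executor's stdout so long
# pauses around the akka/fetch timeouts become visible.
spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps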
Hi Mingyu,
Maybe we should be limiting our heaps to 32GB max and running multiple
workers per machine to avoid large GC issues.
For a 128GB memory, 32 core machine, this could look like:
SPARK_WORKER_INSTANCES=4
SPARK_WORKER_MEMORY=32g
SPARK_WORKER_CORES=8
Are people running with large (32GB+) executor heaps?
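Spelled out in conf/spark-env.sh, and nudging the per-worker memory just under the 32GB limit discussed above, that could look like the following sketch (the 30g value is illustrative, not from the thread):

# conf/spark-env.sh on each 128GB, 32-core machine (illustrative values)
SPARK_WORKER_INSTANCES=4    # four worker JVMs per machine
SPARK_WORKER_CORES=8        # 4 workers x 8 cores = all 32 cores
SPARK_WORKER_MEMORY=30g     # 4 x 30g = 120g; executors on a worker get at most 30g,
                            # leaving ~8GB per machine for the OS and other overhead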