Re: Larger heap leads to perf degradation due to GC

Arun Ahuja Mon, 06 Oct 2014 07:51:41 -0700

We have used the strategy that you suggested, Andrew - using many workers
per machine and keeping the heaps small (< 20gb).

Using a large heap resulted in workers hanging or not responding (leading
to timeouts).  The same dataset/job will fail for us (most often with Akka
disassociation or fetch failure errors) with 10 cores / 100 executors /
60 GB per executor, but succeeds with 1 core / 1000 executors / 6 GB per
executor.
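
For comparison, here is a quick sanity check of the two layouts, reading the
figures as cores and heap per executor. It is only arithmetic on the numbers
above (the dictionary keys and the totals() helper are names made up for this
sketch), but it shows that both layouts ask for the same aggregate cores and
heap, so the difference we see comes from how the resources are split rather
than how much is requested:

```python
# Illustrative arithmetic only; the layout figures come from the message above.
layouts = {
    "large executors": {"executors": 100, "cores_each": 10, "heap_gb_each": 60},
    "small executors": {"executors": 1000, "cores_each": 1, "heap_gb_each": 6},
}


def totals(layout):
    """Return (total cores, total heap in GB) for one executor layout."""
    n = layout["executors"]
    return n * layout["cores_each"], n * layout["heap_gb_each"]


for name, layout in layouts.items():
    cores, heap_gb = totals(layout)
    print(f"{name}: {cores} cores, {heap_gb} GB heap in total")
```

Both lines report 1000 cores and 6000 GB of heap.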

When the job does succeed with more cores per executor and a larger heap, it
is usually much slower than with the smaller executors (the same 8-10 min job
takes 15-20 min to complete).

The unfortunate downside is that we have some large broadcast variables,
which are unnecessarily duplicated across executors and may not fit into
memory when using the smaller executors.
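
To put rough numbers on that trade-off, here is a small sketch. It assumes
each executor JVM holds one full copy of a broadcast variable, and it uses a
made-up 2 GB broadcast size (BROADCAST_GB and broadcast_cost() are
hypothetical names for illustration, not measurements from our jobs):

```python
# Hypothetical broadcast size for illustration; not a figure from our jobs.
BROADCAST_GB = 2

# (executor count, heap per executor in GB) for the two layouts mentioned earlier.
layouts = {"large executors": (100, 60), "small executors": (1000, 6)}


def broadcast_cost(executors, heap_gb, broadcast_gb=BROADCAST_GB):
    """Return (GB of copies cluster-wide, fraction of each heap one copy uses)."""
    return executors * broadcast_gb, broadcast_gb / heap_gb


for name, (executors, heap_gb) in layouts.items():
    total_gb, share = broadcast_cost(executors, heap_gb)
    print(f"{name}: {total_gb} GB of copies, {share:.0%} of each heap")
```

Under those assumptions the small-executor layout holds ten times as many
copies (2000 GB versus 200 GB cluster-wide), and each copy takes about a
third of a 6 GB heap instead of roughly 3% of a 60 GB one.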

Most of this is anecdotal but for the most part we have had more success
and consistency with more executors with smaller memory requirements.

On Sun, Oct 5, 2014 at 7:20 PM, Andrew Ash <and...@andrewash.com> wrote:

> Hi Mingyu,
>
> Maybe we should be limiting our heaps to 32GB max and running multiple
> workers per machine to avoid large GC issues.
>
> For a 128GB memory, 32 core machine, this could look like:
>
> SPARK_WORKER_INSTANCES=4
> SPARK_WORKER_MEMORY=32g
> SPARK_WORKER_CORES=8
>
> Are people running with large (32GB+) executor heaps in production?  I'd
> be curious to hear if so.
>
> Cheers!
> Andrew
>
> On Thu, Oct 2, 2014 at 1:30 PM, Mingyu Kim <m...@palantir.com> wrote:
>
>> This issue definitely needs more investigation, but I just wanted to
>> quickly check if anyone has run into this problem or has general guidance
>> around it. We’ve seen a performance degradation with a large heap on a
>> simple map task (i.e. no shuffle). The slowness starts at around a 50GB
>> heap (i.e. spark.executor.memory=50g), and when we checked the CPU usage,
>> there were just a lot of GCs going on.
>>
>> Has anyone seen a similar problem?
>>
>> Thanks,
>> Mingyu
>>
>
>
