Hi Raj,

Since the number of executor cores determines how many tasks can run in
parallel within an executor, the 6G of executor memory you configured is
effectively shared by 6 concurrent tasks, on top of the split between
memory reserved for caching and memory available for task execution. I
would suggest increasing executor-memory, and adjusting it again if you
later increase the number of executor cores.
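To make the arithmetic concrete, here is a rough back-of-the-envelope sketch using the numbers from this thread (it ignores JVM overhead and Spark's internal reservations, so treat the figures as ballpark only):

```shell
# Rough per-task memory with --executor-memory 6G --executor-cores 6.
executor_mem_mb=$((6 * 1024))   # 6G expressed in MB
cores=6
echo $(( executor_mem_mb / cores ))            # ~1024 MB per concurrent task
# With spark.storage.memoryFraction at its 0.6 default, only ~40% of that
# is left over for task execution:
echo $(( executor_mem_mb * 4 / 10 / cores ))   # ~409 MB per task for execution
```

With only ~400 MB of execution memory per task against a 4.5G input file, heavy GC pressure is not surprising.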

You might also want to adjust the split between cache and task-execution
memory via the spark.storage.memoryFraction setting. By default it is 0.6,
meaning 60% of the executor memory is reserved for the cache. Lowering it
to a smaller fraction, say 0.4 or 0.3, would leave more memory available
for task execution.
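For example, your submit command could be adjusted along these lines. This is only a starting point to experiment with, not a setting verified against your workload, and your-job.jar stands in for your actual application:

```shell
spark-submit \
  --num-executors 6 \
  --executor-memory 8G \
  --executor-cores 6 \
  --driver-memory 3G \
  --conf spark.storage.memoryFraction=0.4 \
  your-job.jar
```

Increasing executor-memory and lowering the storage fraction together raises the per-task execution memory on both fronts.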

Hope this helps!

Thanks,
Deng

On Tue, Jun 16, 2015 at 3:09 AM, diplomatic Guru <[email protected]>
wrote:

> Hello All,
>
>
> I have a Spark job that throws "java.lang.OutOfMemoryError: GC overhead
> limit exceeded".
>
> The job is trying to process a file of size 4.5G.
>
> I've tried the following Spark configuration:
>
> --num-executors 6  --executor-memory 6G --executor-cores 6 --driver-memory 3G
>
> I tried increasing the number of cores and executors, which sometimes
> works, but then it takes over 20 minutes to process the file.
>
> Could I do something to improve the performance, or stop the Java heap
> issue?
>
>
> Thank you.
>
>
> Best regards,
>
>
> Raj.
>
