Hi Tomer,
Are you able to look in your NodeManager logs to see if the NodeManagers
are killing any executors for exceeding memory limits? If you observe
this, you can solve the problem by bumping up
spark.yarn.executor.memoryOverhead.
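For example, a minimal sketch of how you might raise it when building your SparkConf (the 1024 MB value and app name are just illustrative assumptions; in yarn-cluster mode you would pass the same setting via --conf on spark-submit instead):

    import org.apache.spark.{SparkConf, SparkContext}

    // Request extra off-heap headroom per executor container (value is in MB).
    val conf = new SparkConf()
      .setAppName("heavy-job")
      .set("spark.yarn.executor.memoryOverhead", "1024")
    val sc = new SparkContext(conf)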
-Sandy
On Sun, Feb 1, 2015 at 5:28 AM, Tomer Benyamini wrote:
Hi all,
I'm running Spark 1.2.0 on a 20-node YARN EMR cluster. I've noticed that
whenever I run a heavy computation job in parallel with other running jobs,
I get these kinds of exceptions:
* [task-result-getter-2] INFO org.apache.spark.scheduler.TaskSetManager - Lost task 820.0 in stage