On Fri, May 15, 2015 at 5:09 PM, Thomas Gerber
wrote:
> Now, we noticed that we get java heap OOM exceptions on the map output
> tracker when we have too many tasks. I wonder:
> 1. where does the map output tracker live? The driver? The master (when
> those are not the same)?
> 2. how can we increase its memory?
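[Editor's note: the MapOutputTrackerMaster runs inside the driver JVM, so the heap that overflows here is the driver's. A minimal sketch of relieving that pressure at submit time — the memory value, frame size, class name, and jar name below are placeholders, not recommendations:]

```shell
# MapOutputTrackerMaster lives in the driver JVM, so raise the driver heap.
# In Spark 1.x, spark.akka.frameSize (in MB) bounds the size of the
# serialized map-output-status message sent to executors.
# All values and names below are illustrative placeholders.
spark-submit \
  --driver-memory 8g \
  --conf spark.akka.frameSize=128 \
  --class com.example.MyApp \
  myapp.jar
```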
>>> On Wed, Mar 4, 2015 at 8:15 AM, Thomas Gerber
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> We are using spark 1.2.1 on a very large cluster (100 c3.8xlarge
>>>> workers). We use spark-submit to start an application.
>>>>
>>>> The failed stage ended with:
>>>>
>>>> Job aborted due to stage failure: Task 3095 in stage 140.0 failed 4 times,
>>>> most recent failure: Lost task 3095.3 in stage 140.0 (TID 308697,
>>>> ip-10-0-12-88.ec2.internal): org.apache.spark.SparkException: Error
>>>> communicating with MapOutputTracker
>>>>
>>>> We tried the whole application again, and it failed on the same stage (but
>>>> it got more tasks completed on that stage) with the same error.
We then looked at
"org.apache.spark" %% "spark-core" % "1.1.0-SNAPSHOT" % "provided"
One exception I get is:

org.apache.spark.SparkException: Error communicating with MapOutputTracker

How can I fix this?
Found a thread on this
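[Editor's note: the tracker's state grows with the number of map and reduce tasks in a shuffle, so another lever is simply to run fewer, larger tasks. A hedged sketch — the partition count, class name, and jar name are placeholders to be tuned for the actual cluster:]

```shell
# Fewer shuffle partitions means fewer map-output statuses for the
# MapOutputTracker to hold and serialize. The count below is illustrative,
# not a recommendation; class and jar names are placeholders.
spark-submit \
  --conf spark.default.parallelism=2000 \
  --class com.example.MyApp \
  myapp.jar
```

Calling coalesce/repartition on the offending RDD before the wide dependency achieves the same effect for a single stage without changing the global default.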