Is there any update on this thread? Has anyone run into and solved a
similar problem before?

Any pointers will be greatly appreciated!

Best,
XianXing

On Mon, Jun 15, 2015 at 11:48 PM, Jia Yu <[email protected]> wrote:

> Hi Peng,
>
> I got exactly the same error! My shuffle data is also very large. Have you
> figured out a way to solve it?
>
> Thanks,
> Jia
>
> On Fri, Apr 24, 2015 at 7:59 AM, Peng Cheng <[email protected]> wrote:
>
>> I'm deploying a Spark data processing job on an EC2 cluster. The job is
>> small for the cluster (16 cores with 120G RAM in total): the largest RDD
>> has only 76k+ rows, but it is heavily skewed in the middle (and thus
>> requires repartitioning), and each row holds around 100k of data after
>> serialization. The job always gets stuck in repartitioning. Namely, it
>> constantly hits the following errors and retries:
>>
>> org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
>> location for shuffle
>>
>> org.apache.spark.shuffle.FetchFailedException: Error in opening
>> FileSegmentManagedBuffer
>>
>> org.apache.spark.shuffle.FetchFailedException:
>> java.io.FileNotFoundException: /tmp/spark-...
>>
>> I've tried to identify the problem, but both the memory and disk
>> consumption of the machine throwing these errors seem to be below 50%.
>> I've also tried different configurations, including:
>>
>> let driver/executor memory use 60% of total memory
>> let netty prioritize the JVM shuffling buffer
>> increase the shuffling streaming buffer to 128m
>> use KryoSerializer and max out all buffers
>> increase the shuffling memoryFraction to 0.4
>>
>> But none of them works. The small job always triggers the same series of
>> errors and maxes out retries (up to 1000 times). How can I troubleshoot
>> this in such a situation?
>>
>> Thanks a lot for any clues.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/What-are-the-likely-causes-of-org-apache-spark-shuffle-MetadataFetchFailedException-Missing-an-outpu-tp22646.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
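
For reference, the settings listed in Peng's message could be written down roughly as below. This is only a sketch: the original post names no property keys, so the mapping to Spark 1.x properties (`spark.shuffle.io.preferDirectBufs`, `spark.reducer.maxMbInFlight`, `spark.kryoserializer.buffer.max.mb`, `spark.shuffle.memoryFraction`) is a guess at what was meant. The helper names, the per-node RAM parameter, and the Kryo buffer value are hypothetical and chosen for illustration.

```python
def shuffle_tuning_conf(node_ram_gb):
    """Return Spark 1.x properties approximating the settings tried in the thread.

    The key names are an assumed mapping; the post itself does not list them.
    """
    # "let driver/executor memory use 60% of total memory"
    # (integer arithmetic avoids floating-point rounding surprises)
    executor_gb = node_ram_gb * 60 // 100
    return {
        "spark.executor.memory": f"{executor_gb}g",
        # "let netty prioritize JVM shuffling buffer": assumed to mean on-heap buffers
        "spark.shuffle.io.preferDirectBufs": "false",
        # "increase shuffling streaming buffer to 128m": assumed to be this key
        "spark.reducer.maxMbInFlight": "128",
        # "use KryoSerializer and max out all buffers"
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryoserializer.buffer.max.mb": "512",  # illustrative value only
        # "increase shuffling memoryFraction to 0.4"
        "spark.shuffle.memoryFraction": "0.4",
    }


def to_spark_submit_args(conf):
    """Render a property dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # node_ram_gb=30 is a made-up figure; the thread only gives 120G in total.
    print(" ".join(to_spark_submit_args(shuffle_tuning_conf(node_ram_gb=30))))
```

Writing the settings out this way makes it easier to compare exactly which properties were changed between attempts when the same fetch failures keep recurring.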
