Is there any update on this thread? Has anyone run into and solved a similar problem before?
Any pointers will be greatly appreciated!

Best,
XianXing

On Mon, Jun 15, 2015 at 11:48 PM, Jia Yu <[email protected]> wrote:

> Hi Peng,
>
> I got exactly the same error! My shuffle data is also very large. Have you
> figured out a way to solve it?
>
> Thanks,
> Jia
>
> On Fri, Apr 24, 2015 at 7:59 AM, Peng Cheng <[email protected]> wrote:
>
>> I'm deploying a Spark data processing job on an EC2 cluster. The job is
>> small for the cluster (16 cores with 120G RAM in total): the largest RDD
>> has only 76k+ rows, but it is heavily skewed in the middle (and thus
>> requires repartitioning), and each row has around 100k of data after
>> serialization. The job always gets stuck in repartitioning. Namely, it
>> constantly hits the following errors and retries:
>>
>> org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle
>>
>> org.apache.spark.shuffle.FetchFailedException: Error in opening FileSegmentManagedBuffer
>>
>> org.apache.spark.shuffle.FetchFailedException: java.io.FileNotFoundException: /tmp/spark-...
>>
>> I've tried to identify the problem, but both memory and disk consumption
>> on the machines throwing these errors are below 50%. I've also tried
>> different configurations, including:
>>
>> let driver/executor memory use 60% of total memory
>> let netty prioritize the JVM shuffling buffer
>> increase the shuffling streaming buffer to 128m
>> use KryoSerializer and max out all buffers
>> increase shuffling memoryFraction to 0.4
>>
>> But none of them works. The small job always triggers the same series of
>> errors and maxes out retries (up to 1000 times). How can I troubleshoot
>> this in such a situation?
>>
>> Thanks a lot if you have any clue.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/What-are-the-likely-causes-of-org-apache-spark-shuffle-MetadataFetchFailedException-Missing-an-outpu-tp22646.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
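
For anyone trying to reproduce this, the settings listed in Peng's message might map onto Spark 1.x properties roughly as sketched below. This is a guess at the mapping, not the configuration actually used: the thread does not give property names, so `spark.shuffle.io.preferDirectBufs` and `spark.reducer.maxMbInFlight` are my assumptions for the "netty" and "streaming buffer" items, the Kryo buffer value is made up, and the per-node memory passed to the hypothetical helper `build_conf` is a placeholder because the thread only states the cluster total. Newer Spark releases rename some of these properties, so check the configuration page for your version.

```python
def build_conf(node_memory_gb):
    """Return a dict of Spark properties approximating the list in the thread.

    node_memory_gb is a placeholder: the thread gives only the cluster
    total (120G), not the memory per machine.
    """
    # "let driver/executor memory use 60% of total memory"
    heap_gb = node_memory_gb * 60 // 100
    return {
        "spark.executor.memory": f"{heap_gb}g",
        "spark.driver.memory": f"{heap_gb}g",
        # Assumed mapping for "let netty prioritize the JVM shuffling buffer"
        "spark.shuffle.io.preferDirectBufs": "false",
        # Assumed mapping for "increase the shuffling streaming buffer to 128m"
        "spark.reducer.maxMbInFlight": "128",
        # "use KryoSerializer and max out all buffers"
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryoserializer.buffer.max.mb": "512",  # assumed value, not from the thread
        # "increase shuffling memoryFraction to 0.4"
        "spark.shuffle.memoryFraction": "0.4",
    }


def to_submit_args(conf):
    """Flatten the properties into spark-submit style '--conf key=value' arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # 30 GB per node is an illustrative figure only.
    print(" ".join(to_submit_args(build_conf(30))))
```

Writing the settings down this way makes it easier to change one at a time and see which, if any, affects the fetch failures.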
