RE: My first experience with Spark

2015-02-05 Thread java8964
using more time. We plan to have Spark coexist with our Hadoop cluster, so being able to control its memory usage is important for us. Does Spark need that much memory?
Thanks,
Yong
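
[Editor's sketch, not from the thread: one way to cap Spark's footprint so it shares a cluster, assuming a standalone deployment and the Spark 1.x Scala API. spark.executor.memory and spark.cores.max are real settings; the app name and values are illustrative assumptions.]

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative caps only; tune to what the Hadoop cluster can spare.
    val conf = new SparkConf()
      .setAppName("SharedClusterApp")       // hypothetical app name
      .set("spark.executor.memory", "2g")   // heap per executor
      .set("spark.cores.max", "8")          // total cores across the cluster (standalone mode)
    val sc = new SparkContext(conf)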

Re: My first experience with Spark

2015-02-05 Thread Deborah Siegel
Hi Yong, Have you tried increasing your level of parallelism? How many tasks are you getting in the failing stage? 2-3 tasks per CPU core is recommended, though maybe you need more for your shuffle operation? You can configure spark.default.parallelism, or pass in a level of parallelism as a second parameter to the shuffle operation; see the sketch below.
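
[Editor's sketch of both suggestions, using the Spark 1.x Scala API. The input path, app name, and the 200-partition figure (roughly 2-3x a hypothetical total core count) are illustrative assumptions, not values from the thread.]

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("ParallelismExample")          // hypothetical app name
      .set("spark.default.parallelism", "200")   // default partition count for shuffles
    val sc = new SparkContext(conf)

    val pairs = sc.textFile("hdfs:///input")     // hypothetical path
      .map(line => (line, 1))

    // Or set the level per operation: the second argument to reduceByKey
    // is the number of partitions used for that shuffle.
    val counts = pairs.reduceByKey(_ + _, 200)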

RE: My first experience with Spark

2015-02-05 Thread java8964
Finally I gave up after too many failed retries. From the log on the worker side, it looks like it failed with a JVM OOM, as below:

15/02/05 17:02:03 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Driver Heartbeater,5,main]
java.lang.OutOfMemoryError: Java heap space
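
[Editor's sketch, not advice given in the thread: two knobs that commonly help with executor-side heap OOMs in Spark 1.x are spilling cached data to disk and shrinking the cache's share of the heap. The values below are illustrative assumptions; note the driver's own heap can only be raised at launch (e.g. spark-submit --driver-memory), not from inside the application.]

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    val conf = new SparkConf()
      .setAppName("OOMWorkaround")                  // hypothetical app name
      .set("spark.executor.memory", "4g")           // bigger executor heap (illustrative)
      .set("spark.storage.memoryFraction", "0.3")   // leave more heap to shuffle/execution (Spark 1.x setting)
    val sc = new SparkContext(conf)

    val data = sc.textFile("hdfs:///input")         // hypothetical path

    // MEMORY_AND_DISK spills partitions that don't fit in memory to disk
    // instead of failing with an OOM.
    data.persist(StorageLevel.MEMORY_AND_DISK)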