Re: java.lang.OutOfMemoryError: Java heap space

2014-07-31 Thread Haiyang Fu
http://spark.apache.org/docs/latest/tuning.html#level-of-parallelism On Fri, Aug 1, 2014 at 1:29 PM, Haiyang Fu wrote: > Hi, > here are two tips for you, > 1. increase the parallelism level > 2. increase the driver memory > > > On Fri, Aug 1, 2014 at 12:58 AM, Sameer
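For readers finding this thread in the archive, here is a minimal sketch of tip 1, assuming the Spark 1.x Scala API of the time; the app name, input path, and partition counts are hypothetical placeholders, and the "2-3 tasks per CPU core" rule of thumb comes from the tuning page linked above.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: raising the parallelism level so each task processes a
// smaller slice of the data. All names and counts are examples only.
object ParallelismExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("ParallelismExample")
      // Default partition count for shuffle operations (reduceByKey,
      // join, ...); the tuning guide suggests 2-3 tasks per core.
      .set("spark.default.parallelism", "48")
    val sc = new SparkContext(conf)

    // Ask for more input splits up front, or repartition an existing RDD.
    val lines = sc.textFile("hdfs:///data/input", minPartitions = 48)
    val repartitioned = lines.repartition(96)

    println(repartitioned.partitions.length)
    sc.stop()
  }
}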

Re: java.lang.OutOfMemoryError: Java heap space

2014-07-31 Thread Haiyang Fu
Hi, here are two tips for you, 1. increase the parallelism level 2. increase the driver memory On Fri, Aug 1, 2014 at 12:58 AM, Sameer Tilak wrote: > Hi everyone, > I have the following configuration. I am currently running my app in local > mode. > > val conf = new > SparkConf().setMaster("loca
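A sketch of tip 2 follows, assuming the same Spark 1.x era APIs. One caveat worth stating: spark.driver.memory only takes effect if it is read before the driver JVM starts, so with spark-submit the usual route is the --driver-memory flag; the "4g" value and app name below are arbitrary examples.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: increasing the driver memory. Setting the property in
// SparkConf is shown for completeness, but since the driver JVM is
// already running at this point, --driver-memory 4g on spark-submit
// (or the launch environment) is the setting that actually applies.
object DriverMemoryExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("DriverMemoryExample")
      .set("spark.driver.memory", "4g") // usually passed as --driver-memory instead
    val sc = new SparkContext(conf)
    // ... application logic ...
    sc.stop()
  }
}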

Re: Re: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]

2014-07-31 Thread Haiyang Fu
Glad to help you On Fri, Aug 1, 2014 at 11:28 AM, Bin wrote: > Hi Haiyang, > > Thanks, it really is the reason. > > Best, > Bin > > > On 2014-07-31 08:05:34, "Haiyang Fu" wrote: > > Have you tried to increase the driver memory? > > > On T

Re: Ports required for running spark

2014-07-31 Thread Haiyang Fu
es over the resource management, but I constantly > got Exception ConnectionRefused on mentioned port. So, I suppose some spark > internal communications are done via this port... but I don't know what > exactly, and how I can change it... > > Thank you, > Konstantin Kudryavtsev >

Re: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]

2014-07-31 Thread Haiyang Fu
Have you tried to increase the driver memory? On Thu, Jul 31, 2014 at 3:54 PM, Bin wrote: > Hi All, > > The data size of my task is about 30mb. It runs smoothly in local mode. > However, when I submit it to the cluster, it throws the titled error > (Please see below for the complete output). >
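The fix that worked in this thread was more driver memory (see the follow-up above). As a related assumption only: in Spark 1.x the default that matched the "Futures timed out after [30 seconds]" message was spark.akka.askTimeout, so a hedged sketch of raising that knob is shown below; the 120-second value is arbitrary.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: raising the Akka ask timeout whose 30-second default matches
// the error message in the subject. This is an assumption about the
// failing component, not the fix confirmed in this thread.
object TimeoutExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TimeoutExample")
      .set("spark.akka.askTimeout", "120") // seconds; default was 30 in Spark 1.x
    val sc = new SparkContext(conf)
    // ...
    sc.stop()
  }
}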

Re: Ports required for running spark

2014-07-31 Thread Haiyang Fu
Hi Konstantin, Would you please post some more details? Error info or an exception from the log, and in what situation? When you run a Spark job in yarn-cluster mode, YARN will take over all the resource management. On Thu, Jul 31, 2014 at 6:17 PM, Konstantin Kudryavtsev < kudryavtsev.konstan...@gmail.com>
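Since the original question was about pinning down a port, a minimal sketch follows, assuming client-mode deployment behind a firewall; only spark.driver.port is shown, and the port number is an arbitrary example. In yarn-cluster mode YARN launches the driver inside the cluster, so this matters mainly for client-mode setups.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: fixing a port that Spark otherwise picks at random, a common
// source of ConnectionRefused errors when a firewall sits between the
// driver and the executors.
object PortsExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("PortsExample")
      .set("spark.driver.port", "51000") // fixed port for executor -> driver traffic
    val sc = new SparkContext(conf)
    // ...
    sc.stop()
  }
}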

Re: Spark partition

2014-07-30 Thread Haiyang Fu
Hi, you may refer to http://spark.apache.org/docs/latest/tuning.html#level-of-parallelism and http://spark.apache.org/docs/latest/programming-guide.html#parallelized-collections, both of which are about RDD partitions. As you are going to load data from HDFS, you may also need to know h
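A short sketch of the two cases the linked docs cover, partitioning a parallelized collection versus a file on HDFS; the paths and counts are hypothetical, and the block sizes in the comment reflect common Hadoop defaults of the time.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: partition counts for a parallelized collection vs. HDFS input.
object PartitionExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PartitionExample"))

    // Parallelized collection: the slice count is given explicitly.
    val nums = sc.parallelize(1 to 1000000, numSlices = 8)

    // HDFS file: Spark creates roughly one partition per HDFS block
    // (64 MB in Hadoop 1.x, 128 MB in later versions); at load time you
    // can only ask for more partitions than blocks, not fewer.
    val text = sc.textFile("hdfs:///data/input", minPartitions = 16)

    println(s"collection: ${nums.partitions.length}, file: ${text.partitions.length}")
    sc.stop()
  }
}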

Re: How to specify the job to run on the specific nodes(machines) in the hadoop yarn cluster?

2014-07-29 Thread Haiyang Fu
It's really a good question! I'm also working on it On Wed, Jul 30, 2014 at 11:45 AM, adu wrote: > Hi all, > As in the subject. I want to run a job on two specific nodes in the cluster. How do I > configure YARN? Does the yarn queue help? > > Thanks > > >
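The thread itself ends without an answer. On the follow-up question about YARN queues, a hedged sketch follows: a queue (spark.yarn.queue, or --queue on spark-submit) controls which capacity share the job draws from, not which physical machines it lands on, so by itself it does not pin a job to two specific nodes. The queue name below is a hypothetical placeholder.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: submitting to a named YARN queue. This governs resource
// shares under the scheduler, not node placement; a matching queue
// must already exist in YARN's scheduler configuration.
object QueueExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("QueueExample")
      .set("spark.yarn.queue", "dedicated") // hypothetical queue name
    val sc = new SparkContext(conf)
    // ...
    sc.stop()
  }
}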