Re: Running "beyond memory limits" in ConnectedComponents

2015-01-15 Thread Sean Owen
If you give the executor 22GB, it will run with "... -Xmx22g". If the JVM heap gets nearly full, the process will almost certainly consume more than 22GB of physical memory, because the JVM needs memory for more than just the heap. But in this scenario YARN was only asked for 22GB, and it gets killed. This is ex…
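A rough sketch of where that extra physical memory goes (sizes are illustrative, not taken from this thread):

    -Xmx22g heap                            22 GB
    + permgen / code cache                  hundreds of MB
    + thread stacks (~1 MB per thread)      hundreds of MB
    + NIO direct buffers / Netty off-heap   can be substantial for shuffle
    ----------------------------------------------------------------------
    process physical memory                 > 22 GB once the heap fills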

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-15 Thread Nitin kak
Replying to all: Is this "overhead memory" allocation used for any specific purpose? For example, will it be any different if I do *"--executor-memory 22G"* with overhead set to 0% (hypothetically) vs. *"--executor-memory 20G"* with overhead memory at its default (9%), which eventually brings the total…
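Taking that comparison at face value, and using the 7% default Sean cites below, the two set-ups ask YARN for a similar total but split it differently (numbers approximate):

    --executor-memory 22G, overhead 0%       container ~22.0 GB = 22 GB heap + no cushion
    --executor-memory 20G, overhead default  container ~21.4 GB = 20 GB heap + ~1.4 GB cushion

The totals are close, but only the second leaves any room inside the container for the JVM's non-heap memory.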

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-15 Thread Sean Owen
This is a YARN setting. It just controls how much any container can reserve, including Spark executors; that is not the problem. You need Spark to ask YARN for more memory, on top of the memory that is requested by --executor-memory. Your output indicates that the default of 7% is too little. For…
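For example, the overhead can be raised explicitly at submit time (the 2048 here is only an illustrative value; in Spark 1.x this setting is given in MB):

    spark-submit \
      --master yarn-cluster \
      --executor-memory 20G \
      --conf spark.yarn.executor.memoryOverhead=2048 \
      ...

which makes Spark ask YARN for roughly 20 GB + 2 GB per executor instead of 20 GB + 7%.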

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-15 Thread Nitin kak
I am sorry for the formatting error; the value is *yarn.scheduler.maximum-allocation-mb = 28G*.

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-15 Thread Nitin kak
Thanks for sticking to this thread. I am guessing that what memory my app requests and what Yarn requests on my behalf should be the same, determined by the value of *--executor-memory*, which I had set to *20G*. Or can the two values be different? I checked the Yarn configurations (below), so I think th…

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-15 Thread Sean Owen
Those settings aren't relevant, I think. You're concerned with what your app requests and what Spark requests of YARN on your behalf. (Of course, you can't request more than what your cluster allows for a YARN container, for example, but that doesn't seem to be what is happening here.) You do not…
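(For what it's worth, with the *yarn.scheduler.maximum-allocation-mb = 28G* value given earlier in the thread, a 20G executor plus several GB of overhead would still fit under the per-container cap, which is consistent with that setting not being the limiting factor here.)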

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-14 Thread Nitin kak
Thanks Sean. I guess Cloudera Manager has the parameters executor_total_max_heapsize and worker_max_heapsize, which point to the parameters you mentioned above. How much should that cushion between the JVM heap size and the YARN memory limit be? I tried setting the JVM memory to 20g and YARN to 24g, but it ga…
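For scale, using the defaults Sean mentions: the automatic cushion is max(384 MB, 7% of the executor memory), which is only about 1.4 GB for a 20g heap, whereas 20g of heap inside a 24g container amounts to a 4 GB cushion. The cushion ultimately has to be at least as large as the executor's real non-heap usage, i.e. roughly the physical-memory figure in the YARN kill message minus the heap size.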

Re: Running "beyond memory limits" in ConnectedComponents

2015-01-14 Thread Sean Owen
That's not quite what that error means. Spark is not out of memory. It means that Spark is using more memory than it asked YARN for. That, in turn, is because the default cushion established between the YARN-allowed container size and the JVM heap size is too small. See spark.yarn.executor.memoryOverhead.
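A sketch of the relationship (property name as in Spark 1.x; the 2048 MB value below is illustrative only):

    container size requested from YARN
        = --executor-memory + spark.yarn.executor.memoryOverhead
    overhead default = max(384 MB, 7% of executor memory)

    # e.g. in spark-defaults.conf
    spark.yarn.executor.memoryOverhead   2048

The "beyond memory limits" kill fires when the executor process's physical memory exceeds that container size, not when the heap itself is exhausted.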