Reading Sandy's blog, I noticed what seems to be one typo.
bq. Similarly, the heap size can be controlled with the --executor-cores flag or the spark.executor.memory property.
'--executor-memory' should be the right flag there; --executor-cores controls the number of concurrent tasks per executor, not the heap size.
BTW, regarding:
bq. It defaults to max(384, .07 * spark.executor.memory)
That figure is the default memory overhead per executor (the spark.yarn.executor.memoryOverhead setting).
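To put a number on it (my own arithmetic, just to illustrate the formula): with spark.executor.memory=20g, the overhead works out to max(384 MB, 0.07 * 20480 MB) ≈ 1434 MB on top of the 20 GB heap.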
To Akhil's point, see the "Tuning Data Structures" section of the Spark tuning guide; avoid standard collection classes such as java.util.HashMap.
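To make that concrete, here is a rough sketch of the kind of change that section describes (names and sizes below are made up purely for illustration):

    // Heavier: java.util.HashMap keeps an Entry object per pair plus boxed keys/values
    val boxedCounts = new java.util.HashMap[String, Integer]()
    boxedCounts.put("item-1", 1)

    // Lighter: when keys can be kept as dense ints, parallel primitive arrays
    // avoid the per-entry objects and boxing entirely
    val numItems = 1000000                  // illustrative size
    val counts   = new Array[Long](numItems)
    counts(42) += 1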
With fewer machines, try running 4 or 5 cores per executor and only
3-4 executors (1 per node):
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/.
That ought to reduce the shuffle performance hit.
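Purely as an illustration of that layout (the numbers below assume nodes with roughly 8 cores and 30 GB of RAM each and are not from the blog; adjust them for your own hardware), the configuration would look roughly like:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("tuning-sketch")
      .set("spark.executor.instances", "4")   // roughly one executor per node
      .set("spark.executor.cores", "5")       // 4-5 concurrent tasks per executor
      .set("spark.executor.memory", "20g")    // executor heap; leave headroom for
                                              // the memory overhead and the OS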
Go through this once, if you haven't read it already.
https://spark.apache.org/docs/latest/tuning.html
Thanks
Best Regards
On Sat, Mar 28, 2015 at 7:33 PM, nsareen wrote:
> Hi All,
>
> I'm facing performance issues with spark implementation, and was briefly
> investigating on WebUI logs, i noti