Andrew, it's going to be 4 executor JVMs on each r3.8xlarge.
Rastan, you can run a quick test using an EMR Spark cluster on spot instances
and see which configuration works better. Without the tests it is all
speculation.
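
For a rough starting point before running those tests, here is a minimal sketch of the sizing arithmetic for the two layouts being compared. The instance specs (r3.8xlarge: 32 vCPUs, 244 GiB; r3.2xlarge: 8 vCPUs, 61 GiB) are the published ones; the 8 GiB reserved for the OS and daemons and the 10% off-heap overhead are assumptions for illustration, and the function name is hypothetical. The memory YARN actually hands out on EMR may differ, so treat the output as a first guess to test rather than a recommendation.

```python
def executor_layout(node_mem_gib, node_vcpus, executors_per_node,
                    reserved_mem_gib=8, overhead_fraction=0.10):
    """Estimate per-executor heap (GiB) and cores for one worker node.

    reserved_mem_gib and overhead_fraction are illustrative assumptions,
    not values taken from the thread or from any EMR default.
    """
    # Memory left for Spark after setting some aside for the OS and daemons.
    usable_gib = node_mem_gib - reserved_mem_gib
    # Each executor's share must hold the JVM heap plus off-heap overhead.
    share_gib = usable_gib / executors_per_node
    heap_gib = int(share_gib / (1 + overhead_fraction))
    # Split the vCPUs evenly across the executors on the node.
    cores = node_vcpus // executors_per_node
    return {"executor_memory_gib": heap_gib, "executor_cores": cores}


# 4 executor JVMs on each r3.8xlarge (32 vCPUs, 244 GiB)
big = executor_layout(244, 32, executors_per_node=4)
# 1 executor on each r3.2xlarge (8 vCPUs, 61 GiB)
small = executor_layout(61, 8, executors_per_node=1)

for name, cfg in (("r3.8xlarge x4", big), ("r3.2xlarge x1", small)):
    print(f"{name}: --executor-memory {cfg['executor_memory_gib']}g "
          f"--executor-cores {cfg['executor_cores']}")
```

Under these assumptions both layouts land on heaps of roughly 50 GiB with 8 cores per executor, so the test would mostly be comparing 4 JVMs sharing one large machine against 1 JVM per smaller machine.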
On Dec 18, 2015 1:53 PM, "Andrew Or" wrote:
Hi Rastan,
Unless you're using off-heap memory or starting multiple executors per
machine, I would recommend the r3.2xlarge option, since you don't actually
want gigantic heaps (100GB is more than enough). I've personally run Spark
on a very large scale with r3.8xlarge instances, but I've been usi