I am launching EC2 clusters using the spark-ec2 scripts. My understanding is that these scripts configure Spark to use the resources available on each node. I can see that Spark uses the available memory on the larger instance types, but I have never seen it run at more than 400% CPU (i.e. 100% on 4 cores), even on machines with many more cores. Am I misunderstanding the docs? Or do the high-end EC2 instances simply get I/O starved when running Spark? It would be strange for that to consistently produce a hard 400% ceiling, though.
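In case it helps, this is the kind of sanity check I could run from spark-shell on the master to see what Spark thinks it has been given. It is only a sketch, assuming the shell's automatically created SparkContext (sc) and that the relevant config keys were set by spark-ec2; the keys may differ by version:

```scala
// Run inside spark-shell on the EC2 master node.
// sc is the SparkContext that spark-shell creates automatically.

// Default number of partitions Spark will use for operations like parallelize;
// if this is small, jobs may never fan out to all cores.
println(s"defaultParallelism = ${sc.defaultParallelism}")

// Cluster-wide cap on cores, if one was set (prints "unset" otherwise).
println(s"spark.cores.max = ${sc.getConf.get("spark.cores.max", "unset")}")

// Memory granted to each executor, if set.
println(s"spark.executor.memory = ${sc.getConf.get("spark.executor.memory", "unset")}")
```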
Thanks,
Daniel