Hi Rastan,

Unless you're using off-heap memory or starting multiple executors per machine, I would recommend the r3.2xlarge option, since you don't actually want gigantic heaps (100 GB is more than enough). I've personally run Spark at very large scale on r3.8xlarge instances, but I was using off-heap memory, so much of each heap went unused.
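For concreteness, executor sizing on an r3.2xlarge (8 vCPUs, 61 GB) might look something like the following in spark-defaults.conf; the numbers are illustrative, and the off-heap keys assume Spark 1.6's unified memory manager:

    # One executor per r3.2xlarge node; leave headroom for the OS
    # and for the off-heap region below (values are illustrative).
    spark.executor.cores           8
    spark.executor.memory          40g
    # Keep the JVM heap modest and put data off-heap instead, which
    # shortens GC pauses (these two keys require Spark 1.6+).
    spark.memory.offHeap.enabled   true
    spark.memory.offHeap.size      16g

With 40g of heap plus 16g off-heap you still fit under 61 GB with room left for the OS and page cache.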
Yes, if a shuffle file exists locally, Spark just reads it straight from disk rather than fetching it over the network.

-Andrew

2015-12-15 23:11 GMT-08:00 Rastan Boroujerdi <rast...@gmail.com>:

> I'm trying to determine whether I should be using 10 r3.8xlarge or 40
> r3.2xlarge. I'm mostly concerned with the shuffle performance of the
> application.
>
> If I go with r3.8xlarge I will need to configure 4 worker instances per
> machine to keep the JVM size down. The worker instances will likely
> contend with each other for network and disk I/O if they are on the same
> machine. If I go with 40 r3.2xlarge I will be able to allocate a single
> worker instance per box, allowing each worker instance to have its own
> dedicated network and disk I/O.
>
> Since shuffle performance is heavily impacted by disk and network
> throughput, it seems like going with 40 r3.2xlarge would be the better
> configuration of the two. Is my analysis correct? Are there other
> tradeoffs that I'm not taking into account? Does Spark bypass the network
> transfer and read straight from disk if worker instances are on the same
> machine?
>
> Thanks,
>
> Rastan
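P.S. Re: the 4 worker instances per machine you mention above: in standalone mode that's just a spark-env.sh setting. A minimal sketch, assuming 32 vCPUs and 244 GB per r3.8xlarge (the exact split below is illustrative):

    # conf/spark-env.sh on each r3.8xlarge (values are illustrative)
    export SPARK_WORKER_INSTANCES=4    # four workers share the box
    export SPARK_WORKER_CORES=8        # cores offered by each worker
    export SPARK_WORKER_MEMORY=55g     # memory cap for each worker

That keeps each JVM at a manageable size, but as you note, the four workers still share the same NICs and local disks.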