Andrew, it's going to be 4 executor JVMs on each r3.8xlarge. Rastan, you can
run a quick test using an EMR Spark cluster on spot instances and see which
configuration works better. Without the tests it is all speculation.
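Something like the sketch below is enough for that kind of test. It is only a
rough outline assuming the AWS EMR CLI; the cluster name, key name, bid
price, release label, and all sizing numbers are placeholders to adjust for
your own account and workload, not recommendations:

    # Hypothetical spot-instance test cluster (name, key, bid price, and
    # release label are placeholders).
    aws emr create-cluster \
      --name "shuffle-perf-test" \
      --release-label emr-4.2.0 \
      --applications Name=Spark \
      --use-default-roles \
      --ec2-attributes KeyName=my-key \
      --instance-groups \
        InstanceGroupType=MASTER,InstanceType=m3.xlarge,InstanceCount=1 \
        InstanceGroupType=CORE,InstanceType=r3.2xlarge,InstanceCount=40,BidPrice=0.30

    # Illustrative sizing for 4 executor JVMs per r3.8xlarge (32 vCPUs,
    # 244 GiB): 10 nodes x 4 executors = 40 executors, ~8 cores and ~52g
    # heap each, leaving headroom for the OS and YARN overhead.
    spark-submit \
      --master yarn-cluster \
      --num-executors 40 \
      --executor-cores 8 \
      --executor-memory 52g \
      your-job.jar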
On Dec 18, 2015 1:53 PM, "Andrew Or" <and...@databricks.com> wrote:

> Hi Rastan,
>
> Unless you're using off-heap memory or starting multiple executors per
> machine, I would recommend the r3.2xlarge option, since you don't actually
> want gigantic heaps (100GB is more than enough). I've personally run Spark
> on a very large scale with r3.8xlarge instances, but I've been using
> off-heap, so much of the memory was actually not used.
>
> Yes, if a shuffle file exists locally, Spark just reads it from disk.
>
> -Andrew
>
> 2015-12-15 23:11 GMT-08:00 Rastan Boroujerdi <rast...@gmail.com>:
>
>> I'm trying to determine whether I should be using 10 r3.8xlarge or 40
>> r3.2xlarge instances. I'm mostly concerned with the shuffle performance
>> of the application.
>>
>> If I go with r3.8xlarge I will need to configure 4 worker instances per
>> machine to keep the JVM size down. The worker instances will likely
>> contend with each other for network and disk I/O if they are on the same
>> machine. If I go with 40 r3.2xlarge I will be able to allocate a single
>> worker instance per box, giving each worker its own dedicated network
>> and disk I/O.
>>
>> Since shuffle performance is heavily impacted by disk and network
>> throughput, it seems like 40 r3.2xlarge would be the better configuration
>> of the two. Is my analysis correct? Are there other tradeoffs that I'm
>> not taking into account? Does Spark bypass the network transfer and read
>> straight from disk if worker instances are on the same machine?
>>
>> Thanks,
>>
>> Rastan
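For reference, the off-heap setup Andrew mentions maps to the
spark.memory.offHeap.* settings that land in Spark 1.6. A minimal sketch of
the one-executor-per-r3.2xlarge layout in spark-defaults.conf, with sizes
that are purely illustrative:

    # One executor per r3.2xlarge (8 vCPUs, 61 GiB); sizes are
    # illustrative, not tuned values.
    spark.executor.cores            8
    spark.executor.memory           45g

    # Off-heap memory (Spark 1.6+): keeps the JVM heap modest while
    # still putting the machine's RAM to use.
    spark.memory.offHeap.enabled    true
    spark.memory.offHeap.size       8g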