Hi,

Correct me if I am wrong, but I have not heard of such a guideline, perhaps
because sizing is very dynamic and depends on a lot of factors. The most
important factor is the kind of workload: some workloads benefit a lot from
large memory and some don't. So it's not just the input data size, it's also
how you are processing it. On top of that, these guidelines would differ
between cloud services and dedicated clusters, so unless we are endorsing a
particular cloud platform they are not going to be exactly reproducible.
In theory, a few large vertical nodes will do better than many smaller
instances for a memory-intensive workload.

I think it would be good to post experiences, and those can eventually
evolve into some sort of guidelines.
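To make the point concrete, here is a rough back-of-envelope sketch of the
kind of estimate I mean. The multipliers are my own assumptions for
illustration, not official Spark guidance; real numbers depend entirely on
your workload:

```python
# Rough cluster-sizing sketch -- purely illustrative.
# workload_factor and headroom below are assumed values, not Spark defaults.

def estimate_cluster_memory_gb(input_gb, workload_factor, headroom=1.5):
    """Estimate total executor memory needed, in GB.

    workload_factor: how much of the input must live in memory at once
      (e.g. ~0.1 for a simple scan/filter, ~2 for join-heavy or iterative
      jobs that cache intermediate data).
    headroom: extra room for shuffle buffers, JVM overhead, and GC.
    """
    return input_gb * workload_factor * headroom

# Example: 500 GB of input, a join-heavy job caching ~2x the input,
# and nodes with ~60 GB of usable executor memory each.
total_gb = estimate_cluster_memory_gb(500, workload_factor=2.0)
nodes_needed = -(-total_gb // 60)  # ceiling division
print(total_gb, nodes_needed)     # 1500.0 25.0
```

The point of the sketch is that the answer swings by an order of magnitude
just by changing workload_factor, which is why a single published guideline
is hard to write.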


Prashant Sharma


On Thu, Apr 3, 2014 at 1:36 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:

> Hi,
>
> My earlier email did not get any response, I am looking for some
> guidelines for sizing a spark cluster. Please let me know if there are any
> best practices or rules of thumb. Thanks a lot.
>
> Best Regards,
> Sonal
> Nube Technologies <http://www.nubetech.co>
>
>  <http://in.linkedin.com/in/sonalgoyal>
>
>
>
>
> On Fri, Mar 28, 2014 at 4:55 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:
>
>> Hi,
>>
>> I am looking for any guidelines for Spark Cluster Sizing - are there any
>> best practices or links for estimating the cluster specifications based on
>> input data size, transformations etc?
>>
>> Thanks in advance for helping out.
>>
>> Best Regards,
>> Sonal
>> Nube Technologies <http://www.nubetech.co>
>>
>> <http://in.linkedin.com/in/sonalgoyal>
>>
>>
>>
>