Cluster sizing

2019-09-13 Thread Riccardo Ferrari
Hi list, Is there any documentation about how to approach cluster sizing? How do you approach a new deployment? Thanks,

Re: Cluster sizing for recommendations

2015-07-27 Thread Xiangrui Meng
Hi Danny, You might need to reduce the number of partitions (or set userBlocks and productBlocks directly in ALS). Using a large number of partitions increases shuffle size and memory requirements. If you have 16 x 16 = 256 cores, I would recommend 64 or 128 instead of 2048. model.recommendProduct
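
The arithmetic in the reply above can be sketched as a small helper. This is only an illustration: the function name `suggested_block_counts` is hypothetical, and reading "64 or 128 for 256 cores" as one quarter to one half of the total core count is an assumption drawn from that single example, not a documented Spark rule.

```python
from typing import Tuple


def suggested_block_counts(nodes: int, cores_per_node: int) -> Tuple[int, int]:
    """Return (low, high) candidate partition/block counts for a cluster.

    Assumption: the reply's example (64 or 128 for 256 cores) is read as
    one quarter to one half of the total cores. Treat the ratio as a
    starting point to tune, not as a fixed rule.
    """
    if nodes <= 0 or cores_per_node <= 0:
        raise ValueError("nodes and cores_per_node must be positive")
    total_cores = nodes * cores_per_node
    low = max(1, total_cores // 4)   # e.g. 256 // 4 = 64
    high = max(1, total_cores // 2)  # e.g. 256 // 2 = 128
    return low, high


# The cluster from the thread: 16 nodes x 16 cores each.
low, high = suggested_block_counts(nodes=16, cores_per_node=16)
print(f"total cores: {16 * 16}, candidate block counts: {low} or {high}")
```

Either value could then be tried as the partition count, or as the userBlocks and productBlocks settings mentioned in the reply, while watching shuffle size and memory use.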

Cluster sizing for recommendations

2015-07-06 Thread Danny Yates
Hi, I'm having trouble building a recommender and would appreciate a few pointers. I have 350,000,000 events which are stored in roughly 500,000 S3 files and are formatted as semi-structured JSON. These events are not all relevant to making recommendations. My code is (roughly): case class Even

Re: Guidelines for Spark Cluster Sizing

2014-04-03 Thread Sonal Goyal
w.nubetech.co> <http://in.linkedin.com/in/sonalgoyal> On Fri, Mar 28, 2014 at 4:55 PM, Sonal Goyal wrote: > Hi, > I am looking for any guidelines for Spark Cluster Sizing - are there any

Re: Guidelines for Spark Cluster Sizing

2014-04-03 Thread Prashant Sharma
Fri, Mar 28, 2014 at 4:55 PM, Sonal Goyal wrote: > Hi, > I am looking for any guidelines for Spark Cluster Sizing - are there any best practices or links for estimating the cluster specifications based on input data size, transformations etc

Re: Guidelines for Spark Cluster Sizing

2014-04-03 Thread Sonal Goyal
onalgoyal> On Fri, Mar 28, 2014 at 4:55 PM, Sonal Goyal wrote: > Hi, > I am looking for any guidelines for Spark Cluster Sizing - are there any best practices or links for estimating the cluster specifications based on input data size, transformations etc?

Guidelines for Spark Cluster Sizing

2014-03-28 Thread Sonal Goyal
Hi, I am looking for any guidelines for Spark Cluster Sizing - are there any best practices or links for estimating the cluster specifications based on input data size, transformations etc? Thanks in advance for helping out. Best Regards, Sonal Nube Technologies <http://www.nubetech.co>