Re: Recommended cluster parameters

2017-04-30 Thread Zeming Yu
I've got a similar question. Would you be able to provide a rough guide (even a range is fine) on the number of nodes, cores, and total amount of RAM required?
> Do you want to store 1 TB, 1 PB or far more?
- say 6 TB of data in Parquet format on S3
> Do you want to just read that data, retrieve

Re: Recommended cluster parameters

2017-04-30 Thread yohann jardin
It really depends on your needs and your data. Do you want to store 1 TB, 1 PB or far more? Do you want to just read that data, retrieve it, do a little work on it and then read it, or run a complex machine learning pipeline? Depending on the workload, the ratio between cores and storage will
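
As a rough illustration of how data volume and core count interact, here is a back-of-envelope sketch for the 6 TB case mentioned in this thread. Everything in it is an assumption rather than a recommendation: the 128 MB split size, the 10 nodes with 16 cores each, and the helper names `estimate_tasks` and `estimate_waves` are all hypothetical. It also treats 6 TB as the size to be scanned, which ignores Parquet compression and column pruning.

```python
import math


def estimate_tasks(data_tb: float, split_mb: int = 128) -> int:
    """Number of input splits needed to scan data_tb terabytes.

    split_mb is an assumed split size, not a value taken from the thread.
    """
    data_mb = data_tb * 1024 * 1024  # TB -> MB (binary units)
    return math.ceil(data_mb / split_mb)


def estimate_waves(tasks: int, nodes: int, cores_per_node: int) -> int:
    """Number of rounds needed if each core processes one split at a time."""
    total_cores = nodes * cores_per_node
    return math.ceil(tasks / total_cores)


# Hypothetical cluster: 10 nodes with 16 cores each.
tasks = estimate_tasks(6)
waves = estimate_waves(tasks, nodes=10, cores_per_node=16)
print(f"{tasks} splits, {waves} waves on 160 cores")
```

Doubling the core count roughly halves the number of waves, but says nothing about memory needs, which depend on what the job does with each split.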

Recommended cluster parameters

2017-04-30 Thread rakesh sharma
Hi, I would like to know the details of implementing a cluster: what kind of machines one would require, how many nodes, how many cores, etc. Thanks, rakesh