I've got a similar question. Would you be able to provide a rough guide
(even a range is fine) on the number of nodes, cores, and total amount of
RAM required?
Do you want to store 1 TB, 1 PB or far more?
- say 6 TB of data in Parquet format on S3
It really depends on your needs and your data.
Do you want to store 1 TB, 1 PB or far more? Do you want to just read that
data, retrieve it, do a little work on it and then read it, or run a complex
machine learning pipeline? Depending on the workload, the ratio between cores
and storage will
Hi
I would like to know the details of implementing a cluster.
What kind of machines would one require, how many nodes, how many cores, etc.?
thanks
rakesh