If I have a database that partitions naturally into non-overlapping
datasets, in which there are no references between datasets, where each
dataset is quite large (i.e. large enough to merit its own cluster from the
point of view of quantity of data), should I set up one cluster per dataset
or one large cluster for everything together?

As I see it:

The primary advantage of separate clusters is total isolation: if I have a
problem with one dataset, my application will continue working normally for
all other datasets.

The primary advantage of one big cluster is usage pooling: when one server
goes down in a large cluster it's much less important than when one server
goes down in a small cluster. Also, the different temporal usage patterns of
the different datasets (e.g. different peak hours for different datasets) can
be combined to ease overall capacity requirements.
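The pooling argument can be made concrete with a toy calculation (all numbers below are made up for illustration):

```python
# Hypothetical sizing: 5 datasets, each needing 6 servers on its own.
servers_per_small_cluster = 6
num_datasets = 5
servers_big_cluster = servers_per_small_cluster * num_datasets  # 30

# Fraction of capacity lost when a single server fails:
loss_small = 1 / servers_per_small_cluster  # ~16.7% of that dataset's capacity
loss_big = 1 / servers_big_cluster          # ~3.3% of total capacity

# Peak pooling: if peaks don't coincide, the combined peak is lower than
# the sum of the individual peaks, so fewer servers cover the same load.
dataset_a = [10, 10, 40, 10]  # load per hour (arbitrary units), peaks in hour 2
dataset_b = [40, 10, 10, 10]  # peaks in hour 0
combined = [a + b for a, b in zip(dataset_a, dataset_b)]

sum_of_peaks = max(dataset_a) + max(dataset_b)  # 80: provision separately
combined_peak = max(combined)                   # 50: provision pooled
```

With offset peaks you'd provision for 50 units pooled instead of 80 separate; the flip side is that any incident in the big cluster can now touch every dataset at once.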

Any thoughts?