Everything that I've read about data centers focuses on setting things up at the beginning of time.

I have the following situation:

10 machines in a data center (DC1), with a replication factor of 2.

I want to set up a second data center (DC2) with the following configuration:
  20 machines with a replication factor of 4

What I've found is that when I start adding machines, the first one to join the network attempts to replicate all of the data from DC1 and fills up its disk drive. I've played with setting the storage_options to give DC2 a replication factor of 0. With that I can bring up all 20 machines in DC2, but then I start getting a huge number of read errors on reads from DC1.
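
To put rough numbers on the disk problem, here is a back-of-the-envelope sketch. It assumes data is spread evenly across the nodes of a data center and that a data center cannot hold more replicas of a row than it has nodes. The function name is mine, purely for illustration; it is not part of any Cassandra tool.

```python
def fraction_per_node(nodes: int, replication_factor: int) -> float:
    """Approximate share of the full dataset stored on each node of one DC.

    Assumption: tokens are evenly balanced, and a DC with fewer nodes than
    its replication factor holds at most one replica per node.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    effective_rf = min(replication_factor, nodes)
    return effective_rf / nodes


# DC1 today: 10 machines, replication factor 2.
dc1_share = fraction_per_node(10, 2)

# DC2 with only its first machine joined, replication factor 4.
dc2_first_node_share = fraction_per_node(1, 4)

# DC2 once all 20 machines are up, replication factor 4.
dc2_full_share = fraction_per_node(20, 4)

print(f"DC1 node:            {dc1_share:.0%} of the data")
print(f"First DC2 node:      {dc2_first_node_share:.0%} of the data")
print(f"DC2 node (20 nodes): {dc2_full_share:.0%} of the data")
```

Under those assumptions each DC1 machine holds about 20% of the data, and each DC2 machine would also hold about 20% once all 20 are in place. The first DC2 machine on its own is asked to hold 100%, roughly five times what the disks were sized for, which matches what I'm seeing.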

Is there a simple cookbook on how to add a second DC? I'm currently trying to set the replication factor to 1 and do a repair, but that doesn't feel like the right approach.

Thanks,