Thanks for the follow-up. I have a few follow-on questions:

In the case of using decommission, any idea what happens when we get to
the last node in the old data center? Do you think it will decommission
properly? I agree that this sounds like the easiest method. We have to
see whether we can support the storage requirement as we work down the
cluster and decommission.

In the case of changing the RF and dropping the entire old cluster,
here's what I was thinking: we change the RF to 4, which I take to mean
that there will be two copies of the data in each data center. So, if we
just turn off all the nodes in the old data center, we still have two
copies of all data in the new data center, and we can then rebuild and
clean things up with nodetool to get to a normal state. We would then
turn the RF back down to 3 and rebuild in order to get back to our
original config.

The reason I thought this would work is that, since RackAware alternates
replica placement and we have inserted the new data center nodes evenly
in between the old key ranges, a pair of nodes in the new DC would each
get a replica of the data. That would give us some redundancy until we
can rebuild.

I am probably making a bad assumption about RackAwareStrategy that
blocks this. If so, it'd be nice if you could explain it to me.

If you have another idea that might be worth discussing, I'd appreciate
it.

Thanks,
Jake

On Thu, Dec 2, 2010 at 6:11 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> On Thu, Dec 2, 2010 at 4:08 AM, Jake Maizel <j...@soundcloud.com> wrote:
>> Hello,
>>
>> We have a ring of 12 nodes with 6 in one data center and 6 in another.
>> We want to shut down all 6 nodes in data center 1 in order to close
>> it down. We are using a replication factor of 3 and are using
>> RackAwareStrategy with version 0.6.6.
>>
>> We have been thinking that using decommission on each of the nodes in
>> the old data center one at a time would do the trick. Does this sound
>> reasonable?
>
> That is the simplest approach. The major downside is that
> RackAwareStrategy guarantees you will have at least one copy of _each_
> row in both DCs, so when you are down to 1 node in dc1 it will have a
> copy of all the data. If you have a small enough data volume to make
> this feasible then that is the option I would go with.
>
>> We have also been considering increasing the replication factor to 4
>> and then just shutting down all the old nodes. Would that work as far
>> as data availability would go?
>
> Not sure what you are thinking of there, but probably not. :)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

-- 
Jake Maizel
Network Operations
Soundcloud

Mail & GTalk: j...@soundcloud.com
Skype: jakecloud

Rosenthaler strasse 13, 101 19, Berlin, DE
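
As a rough illustration of the storage concern in the quoted reply, the sketch below estimates a lower bound on what each remaining old-DC node has to hold as the decommission proceeds. It assumes only what the reply states (the old DC keeps at least one copy of every row) plus an even spread across the remaining nodes. The 600 GB figure and the function name are hypothetical, not taken from our cluster, and real per-node load may be higher.

```python
def min_load_per_old_dc_node(unique_data_gb: float, nodes_left: int) -> float:
    """Lower bound, in GB, on the data each remaining old-DC node holds.

    Assumption: the old DC as a whole keeps at least one copy of every row,
    and rows are spread evenly across the nodes still in that DC.
    """
    if nodes_left < 1:
        raise ValueError("nodes_left must be at least 1")
    return unique_data_gb / nodes_left


# Hypothetical size of one full copy of the data set.
UNIQUE_DATA_GB = 600.0

# Walk down from 6 old-DC nodes to the last one.
for nodes_left in range(6, 0, -1):
    load = min_load_per_old_dc_node(UNIQUE_DATA_GB, nodes_left)
    print(f"{nodes_left} node(s) left in old DC: at least {load:.0f} GB each")
```

With these made-up numbers the last old-DC node would need room for the full 600 GB, which is the point at which we would have to check disk capacity before continuing.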