You will likely need to rename some of the files to avoid collisions (the file names are only unique per node, so data from several old machines can clash when merged onto one). Otherwise, yes, this can work.
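As a rough sketch of the renaming, here is one way to plan it before touching any files. It assumes the snapshot files follow a `<ColumnFamily>-<generation>-<Component>.db` naming pattern (Data, Index and Filter components), which you should check against what is actually on disk. The function name `plan_renames` and the node labels are made up for illustration. It hands out a fresh generation number per column family so that files merged from several old nodes no longer clash, and it keeps the components of one SSTable on the same number.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <ColumnFamily>-<generation>-<Component>.db
# Verify this against the real snapshot directory before relying on it.
SSTABLE_NAME = re.compile(
    r"^(?P<cf>\w+)-(?P<gen>\d+)-(?P<component>Data|Index|Filter)\.db$"
)

def plan_renames(files_by_node):
    """Return {(node, old_name): new_name} with generations unique per column family.

    files_by_node maps a (hypothetical) source node label to the list of
    file names found in that node's snapshot directory.
    """
    next_gen = defaultdict(int)  # column family -> last generation handed out
    assigned = {}                # (node, cf, old generation) -> new generation
    plan = {}
    for node in sorted(files_by_node):
        for name in sorted(files_by_node[node]):
            m = SSTABLE_NAME.match(name)
            if m is None:
                continue  # not a recognised SSTable component; leave it alone
            cf, old_gen = m.group("cf"), int(m.group("gen"))
            key = (node, cf, old_gen)
            if key not in assigned:
                next_gen[cf] += 1
                assigned[key] = next_gen[cf]
            # All components of one SSTable share the same new generation.
            plan[(node, name)] = f"{cf}-{assigned[key]}-{m.group('component')}.db"
    return plan

if __name__ == "__main__":
    example = {
        "old-node-a": ["Standard1-1-Data.db", "Standard1-1-Index.db", "Standard1-1-Filter.db"],
        "old-node-b": ["Standard1-1-Data.db", "Standard1-1-Index.db", "Standard1-1-Filter.db"],
    }
    for (node, old), new in sorted(plan_renames(example).items()):
        print(f"{node}: {old} -> {new}")
```

This only computes the mapping; the actual copy or move onto the new machine (for example, renaming as part of the rsync destination path) is left out on purpose, so the plan can be reviewed first.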

On Thu, Sep 2, 2010 at 11:09 AM, Anthony Molinaro <antho...@alumni.caltech.edu> wrote:
> Hi,
>
> We're running cassandra 0.6.4, and need to do a data center move of
> a cluster (from EC2 to our own data center). Because of the way the
> networks are set up we can't actually connect these boxes directly, so
> the original plan of add some nodes in the new colo, let them bootstrap
> then decommission nodes in the old colo until the data is all transfered
> will not work.
>
> So I'm wondering if the following will work
>
> 1. take a snapshot on the source cluster
> 2. rsync all the files from the old machines to the new machines (we'd most
>    likely be reducing the total number of machines, so would do things like
>    take 4-5 machines worth of data and put it onto 1 machine)
> 3. bring up the new machines in the new colo
> 4. run cleanup on all new nodes?
> 5. run repair on all new nodes?
>
> So will this work? If so, are steps 4 and 5 correct?
>
> I realize we will miss any new data that happens between the snapshot
> and turning on writes on the new cluster, but I think we might be able
> to just tune compaction such that it doesn't happen, then just sync
> the files that change while the data transfers happen?
>
> Thanks,
>
> -Anthony
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro <antho...@alumni.caltech.edu>