Thanks Edward. So would you say this is a good strategy:

1. snapshot files from production cluster
2. move snapshot files to analysis cluster in a one-to-one node fashion (the system/LocationInfo* sstables could be excluded here but I'm moving them all because the transfer is also part of our DR strategy)
3. delete LocationInfo* sstables in system keyspace on each node in analysis cluster
4. configure analysis cluster nodes to have a different cluster name
5. set initial token for each analysis node (in cassandra.yaml) to the token claimed by the corresponding production cluster node
6. start cassandra (or brisk really) on each analysis node to create separate cluster

Any reason that procedure wouldn't work?

On Sun, Oct 2, 2011 at 3:14 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Sun, Oct 2, 2011 at 4:25 PM, Eric Czech <e...@nextbigsound.com> wrote:
>
>> We're exploring a data processing procedure where we snapshot our
>> production cluster data and move that data to a new cluster for analysis, but
>> I'm having some strange issues where the analysis cluster is still somehow
>> aware of the production cluster (i.e. the production cluster ring is trying
>> to include nodes from the other cluster with the same token).
>>
>> The seed addresses in cassandra.yaml definitely prohibit this type of
>> intersection between the two clusters, so I'm guessing that it has something
>> to do with the information in the system sstables.
>>
>> Is there any way to duplicate raw sstables in an effort to "copy" a cluster
>> such that the copied cluster has a different name? I know this usually
>> results in a "saved cluster name X != Y" sort of error, but it looks like we
>> need to find some sort of way to do this logical separation.
>>
>> Any help would be much appreciated!
>>
>> Thanks.
>
> Cassandra stores information about the cluster topology in the system
> table. This is stored in the LocationInfo column family. If you set
> AutoBootstrap to false, assign the Initial Token correctly, and wipe the
> LocationInfo column family, Cassandra will have no memory of the topology.
> (You can also wipe the entire system keyspace, but then you have to reinstall
> the schema.)
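
For reference, here is a rough Python sketch of steps 3 to 5 from the procedure at the top of this message, combined with the AutoBootstrap suggestion quoted above. It is only an illustration under assumptions: the helper names, the host names, the tokens, and the data directory are hypothetical, and it assumes the system keyspace sstables sit directly in one directory with file names starting with LocationInfo. It only lists files and computes settings; it deletes nothing and writes no configuration.

```python
from pathlib import Path


def location_info_sstables(system_dir):
    """List the LocationInfo* files in the system keyspace directory.

    Assumption: all sstable component files for the system keyspace live
    directly in `system_dir`. Nothing is deleted here; review the list first.
    """
    return sorted(p for p in Path(system_dir).glob("LocationInfo*") if p.is_file())


def analysis_node_settings(cluster_name, production_tokens):
    """Build the per-node cassandra.yaml values for the analysis cluster.

    `production_tokens` maps each analysis host (hypothetical names) to the
    token claimed by its corresponding production node.
    """
    return {
        host: {
            "cluster_name": cluster_name,    # step 4: different cluster name
            "initial_token": str(token),     # step 5: reuse the production token
            "auto_bootstrap": False,         # per the quoted AutoBootstrap advice
        }
        for host, token in production_tokens.items()
    }


if __name__ == "__main__":
    # Hypothetical two-node example; the tokens are placeholders.
    tokens = {
        "analysis1": 0,
        "analysis2": 85070591730234615865843651857942052864,
    }
    for host, settings in analysis_node_settings("Analysis Cluster", tokens).items():
        print(host, settings)

    # Hypothetical data directory; adjust to the node's data_file_directories.
    for path in location_info_sstables("/var/lib/cassandra/data/system"):
        print("would delete:", path)
```

Keeping the listing separate from any actual deletion makes it easy to check the file list on one node before wiping anything, which matters here since the same snapshot files also serve as the DR copy.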