>>>>> "Eric" == Eric Czech <e...@nextbigsound.com> writes:
Eric> We're exploring a data processing procedure where we snapshot
Eric> our production cluster data and move that data to a new
Eric> cluster for analysis but I'm having some strange issues where
Eric> the analysis cluster is still somehow aware of the production
Eric> cluster (i.e. the production cluster ring is trying to include
Eric> nodes from the other cluster with the same token).

Are you using the same cluster name for both clusters? If so, I would
suggest you don't.

Eric> The seed addresses in cassandra.yaml definitely prohibit this
Eric> type of intersection between the two clusters so I'm guessing
Eric> that it has something to do with the information in the system
Eric> sstables.

I'm sure you will get a more knowledgeable answer from people who have
been doing this for a while, but I have to ask: are you copying over the
LocationInfo* SSTables from the snapshot to the analysis cluster? The
LocationInfo CF can record the endpoints in your production cluster.

From the little I've read of the code (StorageService.java and
SystemTable.java), it is possible (likely?) that endpoints from your
production cluster will get added to your analysis cluster's Gossiper
on startup. If you are using the same cluster name, well, there you
have it.

Eric> Is there anyway to duplicate raw sstables in an effort to
Eric> "copy" a cluster such that the copied cluster has a different
Eric> name? I know this usually results in a "saved cluster name X
Eric> != Y" sort of error but it looks like we need to find some
Eric> sort of way to do this logical separation.

Copying the raw tables and ignoring/deleting the
data/system/LocationInfo* files has worked for me. But I have to add
the disclaimer that I'm definitely a Cassandra newbie!

Cheers!
Shyamal
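
P.S. For what it's worth, the copy-and-skip step I described could be scripted roughly like this. This is only a sketch: the function name and the paths in the usage note are made up for illustration, and it assumes the data directory has the data/system/LocationInfo* layout mentioned above. It only copies files; it does nothing about the cluster name or anything else Cassandra checks at startup.

```python
import fnmatch
import shutil
from pathlib import Path


def copy_data_without_location_info(src_data_dir, dst_data_dir):
    """Copy a data directory tree, skipping system/LocationInfo* files.

    Hypothetical helper: assumes src_data_dir is the directory that
    contains the "system" keyspace directory. dst_data_dir must not
    exist yet (shutil.copytree creates it).
    """
    src_root = Path(src_data_dir)
    system_dir = src_root / "system"

    def ignore(current_dir, names):
        # Only filter inside the top-level "system" directory; files
        # in other keyspaces are copied even if the name matches.
        if Path(current_dir) != system_dir:
            return []
        return [n for n in names if fnmatch.fnmatch(n, "LocationInfo*")]

    shutil.copytree(src_root, dst_data_dir, ignore=ignore)
```

You would call it with something like copy_data_without_location_info("/path/to/snapshot/data", "/path/to/analysis/data"), where both paths are placeholders for wherever your snapshot and the new node's data directory live.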