Hi all,

Sorry I couldn't update earlier, as I got caught up in some other stuff.
Anyway, my previous 3-node cluster was on version 3.9. I created a new cluster of Cassandra 3.11.2 with the same number of nodes, on GCE VMs instead of DC/OS. My existing cluster has its Cassandra data on persistent disks, so I made copies of those disks and attached them to the new cluster. I was following this link to move the data to the new cluster:
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html

As mentioned in the link, I manually assigned token ranges to each node according to its corresponding node in the previous cluster. When I restarted the Cassandra process on the VMs, I noticed that it had automatically picked up all my keyspaces and column families. I did not recreate the schema, copy data manually, or run sstableloader, and I am not sure whether this should have happened.

In any case, the data in the two clusters is still not in sync. I ran a simple count query on a table in both clusters and got different results:

Old cluster: 217699
New cluster: 138770

On the new cluster, when I run nodetool repair for my keyspace, it runs fine on one node, but the other two nodes say the keyspace's replication factor is 1, so repair is not needed. Cqlsh, however, shows that the replication factor is 2.

Nodetool status also shows different output for the two clusters:

*Cluster1:*

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns  Host ID                               Rack
UN  10.128.1.1  228.14 GiB  256     ?     63ff8054-934a-4a7a-a33f-405e064bc8e8  rack1
UN  10.128.1.2  231.25 GiB  256     ?     702e8a31-6441-4444-b569-d2d137d54a5d  rack1
UN  10.128.1.3  199.91 GiB  256     ?     b5b22a90-f037-433a-8ad9-f370b26cca26  rack1

*Cluster2:*

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns  Host ID                               Rack
UJ  10.142.0.4  211.27 GiB  256     ?     c55fef77-9c78-449c-b0d9-64e755caee7d  rack1
UN  10.142.0.2  228.14 GiB  256     ?     0065c8e1-47be-4cf8-a3fe-3f4d20ff1b47  rack1
UJ  10.142.0.3  241.77 GiB  256     ?     f3b3f409-d108-4751-93ba-682692e46318  rack1

This is weird because both clusters have essentially the same disks attached to them. Only one node in cluster2 (10.142.0.2) has the same load as its counterpart in cluster1 (10.128.1.1). It is also the node where nodetool repair seems to run fine, and it is acting as the seed node of the second cluster.

I am confused about what might be causing this inconsistency in load and replication factor. Has anyone ever seen a different replication factor for the same keyspace on different nodes? Is there a problem in my workflow? Can anyone please suggest the best way to move data from one cluster to another?

Any help will be greatly appreciated.

On Tue, Apr 17, 2018 at 6:52 AM, Faraz Mateen <fmat...@an10.io> wrote:

> Thanks for the response guys.
>
> Let me try setting token ranges manually and move the data again to
> correct nodes. Will update with the outcome soon.
>
>
> On Tue, Apr 17, 2018 at 5:42 AM, kurt greaves <k...@instaclustr.com>
> wrote:
>
>> Sorry for the delay.
>>
>>> Is the problem related to token ranges? How can I find out token range
>>> for each node?
>>> What can I do to further debug and root cause this?
>>
>> Very likely. See below.
>>
>>> My previous cluster has 3 nodes but replication factor is 2. I am not
>>> exactly sure how I would handle the tokens. Can you explain that a bit?
>>
>> The new cluster will have to have the same token ring as the old if you
>> are copying from node to node.
>> Basically you should get the set of tokens
>> for each node (from nodetool ring) and when you spin up your 3 new nodes,
>> set initial_tokens in the yaml to be the comma-separated list of tokens for
>> *exactly one* node from the previous cluster. When restoring the SSTables you
>> need to make sure you take the SSTables from the original node and place them
>> on the new node that has the *same* list of tokens. If you don't do this
>> it won't be a replica for all the data in those SSTables and consequently
>> you'll lose data (or it simply won't be available).
>>
>
>
> --
> Faraz Mateen
>

--
Faraz Mateen
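
For anyone following the thread, here is a minimal sketch of the token-copying step kurt describes above. The node address comes from the nodetool status output earlier in the thread; the exact commands are my assumption of how one might script it on a Linux shell, not part of the documented procedure:

    # On the old cluster, collect the tokens owned by one node as a
    # comma-separated list (the token is the last column of nodetool ring).
    # 10.128.1.1 is one of the old-cluster addresses shown above.
    nodetool ring | awk '$1 == "10.128.1.1" { print $NF }' | paste -sd, -

    # On the new node that received this node's disk/SSTables, stop Cassandra
    # and set the list in cassandra.yaml before starting it again:
    #   num_tokens: 256
    #   initial_token: <comma-separated list printed by the command above>

Note that the cassandra.yaml property is initial_token (singular); with vnodes it accepts a comma-separated list of tokens.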
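
On the replication-factor question, a quick sketch of how one could check whether all nodes actually agree on the schema and on the keyspace's replication settings. The keyspace name and node address below are placeholders:

    # All nodes should report the same schema version; more than one version
    # listed here means the schema has not propagated to every node.
    nodetool describecluster

    # Compare what each node reports for the keyspace definition
    # (my_keyspace and <node_ip> are placeholders):
    cqlsh <node_ip> -e "DESCRIBE KEYSPACE my_keyspace;"

    # If the replication settings need correcting, alter the keyspace once
    # and then repair it:
    cqlsh <node_ip> -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};"
    nodetool repair -full my_keyspace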