Hi all,

I've run into an issue with Cassandra 2.0.5. I have a 6-node cluster using RandomPartitioner, still with manually assigned tokens instead of vnodes. Because we are changing hardware, we decided to migrate the cluster to 6 new machines and switch from token-based partitioning to vnodes at the same time.

I followed the instructions at http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html and started Cassandra on the 6 new nodes in a new DC. Everything seemed to work correctly: every node saw all the others as up and normal.

Then I ran nodetool repair -pr on the first of the new nodes. The process fell into what looks like an infinite loop, sending and receiving Merkle trees over and over. It hung on one very small keyspace, and there was no sign that it would ever finish (it ran the whole night). So I stopped the repair and restarted Cassandra on that new node. After the restart I tried the repair once more with another small keyspace, but it also fell into an infinite loop.

So I decided to abort the procedure of adding the datacenter, remove the nodes of the new DC and start from scratch. After running removenode for all the new nodes, I wiped the data directory and started Cassandra on a new node again. During startup, the following message appears in the logs:

    org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=98bb99a2-42f2-3fcd-af67-208a4faae5fa

According to Google, this may indicate a problem with schema version consistency, so I ran describe cluster in cassandra-cli and got:

    Cluster Information:
       Name: Metadata Cluster
       Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
       Partitioner: org.apache.cassandra.dht.RandomPartitioner
       Schema versions:
          76198f8b-663f-3434-8860-251ebc6f50c4: [150.254.164.4]
          f48d3512-e299-3508-a29d-0844a0293f3a: [150.254.164.3]
          16ad2e35-1eef-32f0-995c-e2cbd4c18abf: [150.254.164.6]
          72352017-9b0d-3b29-8c55-ed86f30363c5: [150.254.164.1]
          7f1faa84-0821-3311-9232-9407500591cc: [150.254.164.5]
          85cd0ebc-5d33-3bec-a682-8c5880ee2fa1: [150.254.164.2]

So now I have 6 different schema versions in the cluster, one per node.

My questions:

1. How could this have happened?
2. How can I bring my cluster back to a consistent state?
3. What did I do wrong while extending the cluster that made nodetool repair fall into an infinite loop?

At first sight the data looks OK: I can read from the cluster and I get the expected output.

Best regards,
Aleksander
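
For anyone who wants to check for the same condition on their own cluster, the schema comparison can be sketched in Python. This is only a minimal sketch under stated assumptions: the function name schema_versions and the regular expression are hypothetical and based solely on the output format quoted in this message, not on any Cassandra tool or API, and the sample text is copied from that output.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the output in this message:
#   <schema version UUID>: [<host>, <host>, ...]
SCHEMA_LINE = re.compile(
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}):\s*\[([^\]]*)\]"
)


def schema_versions(describe_output):
    """Map each schema version UUID to the list of hosts reporting it."""
    versions = defaultdict(list)
    for version, hosts in SCHEMA_LINE.findall(describe_output):
        versions[version].extend(
            host.strip() for host in hosts.split(",") if host.strip()
        )
    return dict(versions)


# Sample text copied from the "describe cluster" output in this message.
SAMPLE = """
Schema versions:
   76198f8b-663f-3434-8860-251ebc6f50c4: [150.254.164.4]
   f48d3512-e299-3508-a29d-0844a0293f3a: [150.254.164.3]
   16ad2e35-1eef-32f0-995c-e2cbd4c18abf: [150.254.164.6]
   72352017-9b0d-3b29-8c55-ed86f30363c5: [150.254.164.1]
   7f1faa84-0821-3311-9232-9407500591cc: [150.254.164.5]
   85cd0ebc-5d33-3bec-a682-8c5880ee2fa1: [150.254.164.2]
"""

versions = schema_versions(SAMPLE)

# The schema is in agreement only when every node reports the same version.
in_agreement = len(versions) == 1

for version, hosts in sorted(versions.items()):
    print(version, hosts)
print("schema in agreement:", in_agreement)
```

A healthy cluster would be expected to list a single version with all hosts in one bracket, so any result with more than one key means the nodes disagree.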