Hi Dean,

If you are not using vnodes and want to replace a node, set the new token to the old token - 1, not + 1. The reason is that tokens are assigned clockwise around the ring. If you set the new token to the old token - 1, the new node takes over all of the old node's data except for the single token that is still assigned to the old node. If you set the new token to the old token + 1, the new node will only stream the data for one token.

So as a good practice, don't use 0 as a node token; start with 100. It's easier to go down from 100 than from 0 (where you would need to calculate 2^127 - 1).
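The token arithmetic above can be sketched in a few lines of Python. This is only an illustration, not anything from Cassandra itself: the helper names `replacement_token` and `owned_count` are hypothetical, the ring size of 2^127 is an assumption taken from the wrap-around value mentioned above (RandomPartitioner-style tokens), and the example tokens 100 and 1000 are made up. It assumes each node owns the range from the previous node's token (exclusive) up to its own token (inclusive).

```python
# Assumed token space: 0 .. 2**127 - 1 (RandomPartitioner-style ring).
RING_SIZE = 2**127


def replacement_token(old_token: int, ring_size: int = RING_SIZE) -> int:
    """Return old_token - 1, wrapping around the ring when old_token is 0."""
    return (old_token - 1) % ring_size


def owned_count(token: int, previous_token: int, ring_size: int = RING_SIZE) -> int:
    """Count the tokens in (previous_token, token], walking clockwise.

    Assumes the two tokens are distinct.
    """
    return (token - previous_token) % ring_size


# Hypothetical ring: the previous node sits at token 100, the dead node at 1000.
previous, old = 100, 1000

# New node at old - 1: it owns (100, 999], almost all of the old range.
new_minus = replacement_token(old)
print(new_minus, owned_count(new_minus, previous))

# The dead node is left with (999, 1000], a single token.
print(owned_count(old, new_minus))

# New node at old + 1: it owns only (1000, 1001], a single token.
new_plus = (old + 1) % RING_SIZE
print(owned_count(new_plus, old))

# Going down from token 0 wraps to the top of the ring.
print(replacement_token(0) == 2**127 - 1)
```

Starting node tokens at 100 rather than 0 simply avoids the wrap-around case shown in the last line.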
Hope I didn't confuse you.

-Wei

----- Original Message -----
From: "Dean Hiller" <dean.hil...@nrel.gov>
To: user@cassandra.apache.org
Sent: Tuesday, March 19, 2013 8:25:25 AM
Subject: Re: Recovering from a faulty cassandra node

I have not done this yet, but from all that I have read, your best option is to follow the replace node documentation, which I believe requires you to

1. Have the token be the same BUT add 1 to it so it doesn't think it's the same computer
2. Have the bootstrap option set or something so streaming takes effect.

I would, however, test all of that in QA to make sure it works. If you have QUORUM reads/writes, a good part of that test would be to take node X down after your node Y is back in the cluster to make sure reads/writes are working on the node you fixed. You just need to make sure node X shares one of the token ranges of node Y AND your writes/reads are in that token range.

Dean

From: Jabbar Azam <aja...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, March 19, 2013 8:51 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Recovering from a faulty cassandra node

Hello,

I am using Cassandra 1.2.2 on a 4 node test cluster with vnodes. I waited for over a week to insert lots of data into the cluster. Towards the end of the process one of the nodes had a hardware fault. I have fixed the hardware fault but the file system on that node is corrupt, so I'll have to reinstall the OS and Cassandra.

I can think of two ways of reintegrating the host into the cluster:

1) shrink the cluster to three nodes and add the node into the cluster
2) Add the node into the cluster without shrinking

I'm not sure of the best approach to take and I'm not sure how to achieve each step.

Can anybody help?

--
Thanks

Jabbar Azam