Re: Recovering from a faulty cassandra node

Wei Zhu Tue, 19 Mar 2013 10:13:37 -0700

Hi Dean,
If you are not using vnodes and want to replace the node, set the new token to 
old token - 1, not + 1. The reason is that tokens are assigned clockwise 
around the ring. If you set the new token to old token - 1, the new node will 
take over all the data of the old node except for the one token that stays 
assigned to the old node. If you set the new token to old token + 1, the new 
node will only stream the data for one token. So as a good practice, don't use 
0 as a node token; start with 100. It's easier to go down from 100 than to go 
down from 0 (where you need to calculate 2^127 - 1).
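
To make the arithmetic concrete, here is a rough Python sketch. It assumes a 
RandomPartitioner-style token space of size 2^127 and a made-up ring where the 
previous neighbour sits at token 100 and the failed node at token 5000. The 
function names are only for illustration, they are not part of Cassandra.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner-style token space


def owned_tokens(prev_token, token, ring_size=RING_SIZE):
    """Count the tokens in the range (prev_token, token], walking clockwise."""
    return (token - prev_token) % ring_size


def replacement_token(old_token, ring_size=RING_SIZE):
    """Return old token - 1, wrapping around the ring when old token is 0."""
    return (old_token - 1) % ring_size


# Hypothetical ring: neighbour before the failed node at 100, failed node at 5000.
prev_token, old_token = 100, 5000

minus_one = replacement_token(old_token)      # 4999
print(owned_tokens(prev_token, minus_one))    # 4899: almost all of the old range
print(owned_tokens(minus_one, old_token))     # 1: what stays with the old token

plus_one = old_token + 1
print(owned_tokens(old_token, plus_one))      # 1: the new node gets a single token

# Going down from 0 wraps to the top of the ring.
print(replacement_token(0) == 2 ** 127 - 1)   # True
```

Starting at 100 instead of 0 just means the first call never has to wrap.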

Hope I didn't confuse you.

-Wei

----- Original Message -----
From: "Dean Hiller" <dean.hil...@nrel.gov>
To: user@cassandra.apache.org
Sent: Tuesday, March 19, 2013 8:25:25 AM
Subject: Re: Recovering from a faulty cassandra node

I have not done this yet, but from all that I have read your best option 
is to follow the replace node documentation, which I believe says you need to:

 1.  Have the token be the same BUT add 1 to it so it doesn't think it's the 
same computer
 2.  Have the bootstrap option set or something similar so streaming takes 
effect.

I would, however, test all of that in QA to make sure it works. If you use 
QUORUM reads/writes, a good part of that test would be to take node X down after 
your node Y is back in the cluster to make sure reads/writes are working on the 
node you fixed. You just need to make sure node X shares one of the token 
ranges of node Y AND your writes/reads are in that token range.
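
A small sketch of why that test is meaningful, assuming a replication factor 
of 3 and a hypothetical third replica Z for the shared range (neither is stated 
in this thread). It only does the quorum arithmetic and does not talk to 
Cassandra.

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def quorum_ok(replicas, down):
    """True if enough replicas of the range are still up to reach QUORUM."""
    alive = [node for node in replicas if node not in down]
    return len(alive) >= quorum(len(replicas))


# Hypothetical replica set for one token range (assumes RF = 3).
replicas = ["X", "Y", "Z"]

print(quorum(3))                          # 2
print(quorum_ok(replicas, {"X"}))         # True: Y and Z must both answer
print(quorum_ok(replicas, {"X", "Y"}))    # False: Z alone cannot reach QUORUM
```

With X down, a QUORUM request in that range can only succeed if the repaired 
node Y is really serving it.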

Dean

From: Jabbar Azam <aja...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, March 19, 2013 8:51 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Recovering from a faulty cassandra node

Hello,

I am using Cassandra 1.2.2 on a 4 node test cluster with vnodes. It took 
over a week to insert lots of data into the cluster. Towards the end of the 
process one of the nodes had a hardware fault.

I have fixed the hardware fault but the file system on that node is corrupt, 
so I'll have to reinstall the OS and Cassandra.

I can think of two ways of reintegrating the host into the cluster:

1) Shrink the cluster to three nodes and then add the node back into the cluster

2) Add the node back into the cluster without shrinking

I'm not sure of the best approach to take and I'm not sure how to achieve each 
step.

Can anybody help?

--
Thanks

 Jabbar Azam
