Bringing a dead node back up after fixing hardware issues

Eran Chinthaka Withana Mon, 23 Jul 2012 16:27:11 -0700

Hi,

In my cluster, one of the nodes went down (due to a hardware failure). We
managed to get it fixed in a couple of days. But it seems it's harder to bring
this same node back into the cluster without creating read misses. Here is
what I did.


Method 1: I copied the data from all the nodes in that data center onto
the repaired node and brought it back up. But because of the rate of
updates, read misses started going up.

Method 2: I issued a removetoken command for that node's token and let the
cluster stream the data to the relevant nodes. At the end of this process,
the dead node no longer showed up in the ring output. Then I brought the
node back up. I was expecting Cassandra to first stream data to the new
node (which happens to be the dead node that was in the cluster earlier)
and, once that's done, let it serve reads. But in the server log I can
see that as soon as the node came up, it started serving reads, creating a
large number of read misses.
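
For reference, the Method 2 sequence can be sketched as the small Python
helper below. It only assembles the two nodetool invocations mentioned
above (removetoken, then ring). The host address and token are
placeholders, not values from my cluster, and the function names are made
up for illustration.

```python
import subprocess


def method2_commands(live_host, dead_token):
    """Return the nodetool invocations for the Method 2 sequence, in order.

    live_host and dead_token are placeholders supplied by the caller.
    """
    return [
        # Ask a live node to remove the dead node's token from the ring.
        ["nodetool", "-h", live_host, "removetoken", str(dead_token)],
        # Print the ring afterwards; the dead node should no longer be listed.
        ["nodetool", "-h", live_host, "ring"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in method2_commands("10.0.0.1", 0):
        print(" ".join(argv))
```

Calling run_all(method2_commands(host, token)) would actually execute the
commands; the dry run above only prints them.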

So the question is: what is the best way to bring back a dead node (once
its hardware issues are fixed) without causing read misses?

Thanks,
Eran Chinthaka Withana
