Hello,

I'm having trouble bootstrapping a replacement node, and I suspect it
could be a bug in Cassandra. I'm using C* 1.2.13, RF=2, with vnodes
disabled. Below is a simplified version of my ring:

* n1 : token 100
* n2 : token 200 (DEAD)
* n3 : token 300
* n4 : token 0

n2 has died, so I tried bootstrapping a new replacement node:

* x : token 199 (n2.token-1)

Even though n2 had been terminated and was seen as DOWN by n1, n3 and n4,
the replacement node x saw n2 as UP and immediately tried to stream data
from it during bootstrap. After about 10 minutes, when x finally detected
n2 as DOWN, the bootstrap failed because its streaming source was gone.

Since that did not work, I tried the following procedure for replacing
n2:

- Remove n2 from the ring. This makes n3 stream n2's data to n1.
- After the leave is complete, try to bootstrap x again.

Ideally, x would stream data from both n1 and n3, but it always streams
data only from n3. The problem is that at some point x sees n3 as DOWN,
and the bootstrap fails again.
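
To make the expected streaming sources concrete, here is a minimal sketch of
which ranges x would take over and which live nodes currently hold them. It
assumes SimpleStrategy-style placement (the RF nodes clockwise from a token),
which I have not stated above, and the ring as it looks after n2 is removed.
The function names are made up for illustration and are not Cassandra APIs.

```python
from bisect import bisect_left

def replicas_for(token, ring, rf):
    """Return the rf nodes found clockwise from `token` (SimpleStrategy-like).

    `ring` maps node name -> token. This is an assumed, simplified model.
    """
    ordered = sorted(ring.items(), key=lambda kv: kv[1])
    tokens = [t for _, t in ordered]
    # First node whose token is >= the requested token, wrapping around.
    start = bisect_left(tokens, token) % len(ordered)
    count = min(rf, len(ordered))
    return [ordered[(start + i) % len(ordered)][0] for i in range(count)]

def ranges_gained(new_node, new_token, ring, rf):
    """Map each (start, end] range that new_node would replicate
    to the nodes that hold that range in the current ring."""
    after = dict(ring)
    after[new_node] = new_token
    ordered = sorted(after.values())
    gained = {}
    for i, end in enumerate(ordered):
        start = ordered[i - 1]  # index -1 wraps to the last token
        if new_node in replicas_for(end, after, rf):
            gained[(start, end)] = replicas_for(end, ring, rf)
    return gained

# Ring after n2 has been removed (tokens taken from the list above).
ring = {"n4": 0, "n1": 100, "n3": 300}
sources = ranges_gained("x", 199, ring, rf=2)
for token_range, holders in sorted(sources.items()):
    print(token_range, holders)
```

Under these assumptions x picks up two ranges, one held by n1 and n3 and one
held by n3 and n4. n3 is a candidate source for both, so the real source
selection (which also depends on the snitch and the failure detector) may
legitimately differ from what I expected.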

I suspect there is some kind of inconsistency in the gossip information of
n2 that is preventing x from streaming data from both n1 and n3. I tried
purging n2 from gossip, using Gossiper.unsafeAssassinateEndpoint() via JMX,
but I'm getting the following error:

*"Problem invoking unsafeAssassinateEndpoint :
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0"*

My next and last resort is to manually copy the sstables via rsync from
n3 and start x with auto_bootstrap=false, but I would really rather avoid
this approach. Is it really this hard to bootstrap a new node without
vnodes in C* 1.2, or could this be hiding some kind of bug? Any feedback
would be greatly appreciated.

Cheers,

-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br <http://www.chaordic.com.br/>*
