Are you replacing the node with the same token and the same IP?

On Thu, Mar 13, 2014 at 4:36 AM, Paulo Ricardo Motta Gomes <paulo.mo...@chaordicsystems.com> wrote:

> Some further info:
>
> I'm not using Vnodes, so I'm using the 1.1 replace node trick of setting
> the initial_token in the cassandra.yaml file to the value of the dead
> node's token -1, and autobootstrap=true. However, according to the Apache
> wiki (
> https://wiki.apache.org/cassandra/Operations#For_versions_1.2.0_and_above),
> on 1.2 you should actually remove the dead node from the ring, before
> adding a replacement node.
>
> Does that mean the trick of setting the initial token to the value of the
> dead node's -1 (described in
> http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node) is
> not valid anymore in 1.2 without vnodes?
>
>
> On Wed, Mar 12, 2014 at 5:57 PM, Paulo Ricardo Motta Gomes <
> paulo.mo...@chaordicsystems.com> wrote:
>
>> Hello,
>>
>> I'm trying to replace a dead node using the procedure in [1], but the
>> replacement node initially sees the dead node as UP, and after a few
>> minutes the node is marked as DOWN again, failing the streaming/bootstrap
>> procedure of the replacement node. This dead node is always seen as DOWN by
>> the rest of the cluster.
>>
>> Could this be a bug? I can easily reproduce it in our production
>> environment, but don't know if it's reproducible in a clean environment.
>>
>> Version: 1.2.13
>>
>> Here is the log from the replacement node (192.168.1.10 is the dead node):
>>
>>  INFO [GossipStage:1] 2014-03-12 20:25:41,089 Gossiper.java (line 843)
>> Node /192.168.1.10 is now part of the cluster
>>  INFO [GossipStage:1] 2014-03-12 20:25:41,090 Gossiper.java (line 809)
>> InetAddress /192.168.1.10 is now UP
>>  INFO [GossipTasks:1] 2014-03-12 20:34:54,238 Gossiper.java (line 823)
>> InetAddress /192.168.1.10 is now DOWN
>> ERROR [GossipTasks:1] 2014-03-12 20:34:54,240 AbstractStreamSession.java
>> (line 110) Stream failed because /192.168.1.10 died or was
>> restarted/removed (streams may still be active in background, but further
>> streams won't be started)
>>  WARN [GossipTasks:1] 2014-03-12 20:34:54,240 RangeStreamer.java (line
>> 246) Streaming from /192.168.1.10 failed
>> ERROR [GossipTasks:1] 2014-03-12 20:34:54,240 AbstractStreamSession.java
>> (line 110) Stream failed because /192.168.1.10 died or was
>> restarted/removed (streams may still be active in background, but further
>> streams won't be started)
>>  WARN [GossipTasks:1] 2014-03-12 20:34:54,241 RangeStreamer.java (line
>> 246) Streaming from /192.168.1.10 failed
>>
>> [1]
>> http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node
>>
>>
>> Cheers,
>>
>> Paulo
>>
>> --
>> *Paulo Motta*
>>
>> Chaordic | *Platform*
>> *www.chaordic.com.br <http://www.chaordic.com.br/>*
>> +55 48 3232.3200
>> +55 83 9690-1314
>>
>
>
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br <http://www.chaordic.com.br/>*
> +55 48 3232.3200
> +55 83 9690-1314
>
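
For anyone comparing notes on the same symptom: the window during which the replacement node believed the dead node was alive can be measured from the Gossiper lines in the quoted log. The following is only a minimal Python sketch and not part of the original report. The names `state_changes` and `up_to_down_gaps` are hypothetical, and the regex assumes the log layout shown in the quoted excerpt, with each log entry unwrapped onto a single line.

```python
import re
from datetime import datetime

# Matches Gossiper state-change entries such as:
#   INFO [GossipStage:1] 2014-03-12 20:25:41,090 Gossiper.java (line 809) InetAddress /192.168.1.10 is now UP
# Assumption: the timestamp and message layout are exactly as in the quoted log.
STATE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) Gossiper\.java \(line \d+\) "
    r"InetAddress /(?P<ip>\S+) is now (?P<state>UP|DOWN)"
)


def state_changes(log_text):
    """Return (timestamp, ip, state) tuples for every UP/DOWN entry, in log order."""
    events = []
    for m in STATE_RE.finditer(log_text):
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        events.append((ts, m.group("ip"), m.group("state")))
    return events


def up_to_down_gaps(events):
    """Map each ip to the seconds between an UP entry and the next DOWN entry."""
    last_up = {}
    gaps = {}
    for ts, ip, state in events:
        if state == "UP":
            last_up[ip] = ts
        elif ip in last_up:
            gaps[ip] = (ts - last_up.pop(ip)).total_seconds()
    return gaps


# The three Gossiper entries from the quoted log, unwrapped to one line each.
SAMPLE = """\
 INFO [GossipStage:1] 2014-03-12 20:25:41,089 Gossiper.java (line 843) Node /192.168.1.10 is now part of the cluster
 INFO [GossipStage:1] 2014-03-12 20:25:41,090 Gossiper.java (line 809) InetAddress /192.168.1.10 is now UP
 INFO [GossipTasks:1] 2014-03-12 20:34:54,238 Gossiper.java (line 823) InetAddress /192.168.1.10 is now DOWN
"""

if __name__ == "__main__":
    for ip, seconds in up_to_down_gaps(state_changes(SAMPLE)).items():
        print(f"{ip} was seen as UP for {seconds:.1f}s before being marked DOWN")
```

On the quoted excerpt this reports a gap of roughly 553 seconds (a little over nine minutes) for 192.168.1.10. That matches the "after a few minutes" description, but says nothing by itself about why the node was seen as UP in the first place.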