I'm still missing something, please excuse me. Let's say, for example, that I have a 4 node cluster with a replication factor of 2. One node goes down and I have to reinstall it. In the meantime the cluster still works and data is read and written.
After a while the node is reinstalled, the same IP is used, and the Cassandra configuration is restored (but the data is not). Wouldn't it be enough to just start Cassandra, and maybe run a repair? On which node, and at which point of this scenario, should I use decommission and/or removenode?

On 19 Mar 2013, at 16:56, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Decommission doesn't need a RF > 1 since it is run from the node being
> removed from the cluster. Before leaving, it gives its data to the next
> node in the ring, which will then be responsible for it.
> Removenode (at least if it is like the old removetoken) uses replicas to
> dispatch the data to their new nodes. So yes, this one needs a RF > 1, but
> has the advantage that it can be used when a node is totally unreachable.
>
> But anyway, having RF = 1 is pretty bad since you have a SPOF (Single Point
> Of Failure), which C* can avoid with a higher RF.
>
> Alain
>
>
> 2013/3/19 Marco Matarazzo <marco.matara...@hexkeep.com>
> Is nodetool removenode / decommission actually needed when having a RF > 1?
> What does it do, exactly?
>
> On 19 Mar 2013, at 16:45, Alain RODRIGUEZ <arodr...@gmail.com>
> wrote:
>
> > In 1.2, you may want to use nodetool removenode if your server is broken
> > or unreachable; otherwise I guess nodetool decommission remains the right
> > way to remove a node. (http://www.datastax.com/docs/1.2/references/nodetool)
> >
> > When this node is out, rm -rf /yourpath/cassandra/* on this server, change
> > the configuration if needed (not sure about the auto_bootstrap param) and
> > start Cassandra on that node again. It should join the ring as a new node.
> >
> > Good luck.
> >
> >
> > 2013/3/19 Hiller, Dean <dean.hil...@nrel.gov>
> > Since you "cleared" out that node, it IS the replacement node.
> >
> > Dean
> >
> > From: Jabbar Azam <aja...@gmail.com>
> > Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> > Date: Tuesday, March 19, 2013 9:29 AM
> > To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> > Subject: Re: Recovering from a faulty cassandra node
> >
> > Hello Dean.
> >
> > I'm using vnodes so I can't specify a token. In addition, I can't follow
> > the replace node docs because I don't have a replacement node.
> >
> >
> > On 19 March 2013 15:25, Hiller, Dean <dean.hil...@nrel.gov> wrote:
> > I have not done this as of yet, but from all that I have read your best
> > option is to follow the replace node documentation, in which I believe you
> > need to
> >
> >
> > 1. Have the token be the same BUT add 1 to it so it doesn't think it's
> > the same computer
> > 2. Have the bootstrap option set or something so streaming takes effect.
> >
> > I would however test that all out in QA to make sure it works, and if you
> > have QUORUM reads/writes a good part of that test would be to take node X
> > down after your node Y is back in the cluster to make sure reads/writes are
> > working on the node you fixed. You just need to make sure node X shares
> > one of the token ranges of node Y AND your writes/reads are in that token
> > range.
> >
> > Dean
> >
> > From: Jabbar Azam <aja...@gmail.com>
> > Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> > Date: Tuesday, March 19, 2013 8:51 AM
> > To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> > Subject: Recovering from a faulty cassandra node
> >
> > Hello,
> >
> > I am using Cassandra 1.2.2 on a 4 node test cluster with vnodes. I spent
> > over a week inserting lots of data into the cluster. Near the end of
> > the process one of the nodes had a hardware fault.
> >
> > I have fixed the hardware fault but the file system on that node is
> > corrupt, so I'll have to reinstall the OS and Cassandra.
> >
> > I can think of two ways of reintegrating the host into the cluster:
> >
> > 1) Shrink the cluster to three nodes and add the node into the cluster
> >
> > 2) Add the node into the cluster without shrinking
> >
> > I'm not sure of the best approach to take and I'm not sure how to achieve
> > each step.
> >
> > Can anybody help?
> >
> >
> > --
> > Thanks
> >
> > Jabbar Azam

--
Marco Matarazzo
== Hex Keep ==

W: http://www.hexkeep.com
M: +39 347 8798528
E: marco.matara...@hexkeep.com

"You can learn more about a man
in one hour of play
than in one year of conversation." - Plato
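
For reference, the rule Alain gives in the quoted reply (decommission is run on the node that is leaving and works even with RF = 1; removenode rebuilds from replicas, so it needs RF > 1 but works when the node is unreachable) can be written down as a small decision helper. This is only a sketch of that rule as stated in the thread, not anything Cassandra ships: the function name and its parameters are made up for illustration, and it returns just the nodetool subcommand name.

```python
def choose_removal_command(node_reachable: bool, replication_factor: int) -> str:
    """Hypothetical helper: pick a nodetool subcommand per the rule quoted in the thread."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if node_reachable:
        # decommission is run on the leaving node itself, which streams its
        # own data to the rest of the ring, so it works even with RF = 1.
        return "decommission"
    if replication_factor > 1:
        # removenode is run from a live node and relies on the other replicas
        # to re-stream the data, so the dead node does not have to be reachable.
        return "removenode"
    # Unreachable node and RF = 1: no other replica holds its data.
    raise RuntimeError("node unreachable with RF = 1: its data cannot be recovered from replicas")
```

For the scenario in this message (node down, RF = 2), `choose_removal_command(False, 2)` returns `"removenode"`. Whether that step is needed at all when the node comes back with the same IP and an empty data directory is exactly the open question above.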