Thanks a lot for your answers.

2013/11/29 John Sanda <john.sa...@gmail.com>

> Couldn't another reason for doing cleanup sequentially be to avoid data
> loss? If data is being streamed from a node during bootstrap and cleanup is
> run too soon, couldn't you wind up in a situation with data loss if the new
> node being bootstrapped goes down (permanently)?
>
> On Thu, Nov 28, 2013 at 8:59 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
>>> I hope I get this right :)
>>
>> Thanks for contributing :)
>>
>>> a repair will trigger a major compaction on your node which will take up
>>> a lot of CPU and IO performance. It needs to do this to build up the data
>>> structure that is used for the repair. After the compaction this is
>>> streamed to the different nodes in order to repair them.
>>
>> It does not trigger a major compaction; that’s what we call running
>> compaction on the command line and compacting all SSTables into one big
>> one.
>>
>> It will flush all the data to disk, which will create some additional
>> compaction.
>>
>> The major concern is that it’s a disk IO intensive operation: it reads all
>> the data and writes data to new SSTables (a one to one mapping). If you
>> have all nodes doing this at the same time there may be some degraded
>> performance. And as it’s all nodes, it’s not possible for the Dynamic Snitch
>> to avoid nodes if they are overloaded.
>>
>> Cleanup is less intensive than repair, but it’s still a good idea to
>> stagger it. If you need to run it on all machines (or you have very
>> powerful machines) it’s probably going to be OK.
>>
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> New Zealand
>> @aaronmorton
>>
>> Co-Founder & Principal Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> On 26/11/2013, at 5:14 am, Artur Kronenberg <artur.kronenb...@openmarket.com> wrote:
>>
>>> Hi Julien,
>>>
>>> I hope I get this right :)
>>>
>>> a repair will trigger a major compaction on your node which will take up
>>> a lot of CPU and IO performance. It needs to do this to build up the data
>>> structure that is used for the repair. After the compaction this is
>>> streamed to the different nodes in order to repair them.
>>>
>>> If you trigger this on every node simultaneously you basically take the
>>> performance away from your cluster. I would expect Cassandra to still
>>> function, just way slower than before. Triggering it node after node will
>>> leave your cluster with more resources to handle incoming requests.
>>>
>>> Cheers,
>>>
>>> Artur
>>>
>>> On 25/11/13 15:12, Julien Campan wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm working with Cassandra 1.2.2 and I have a question about nodetool
>>>> cleanup.
>>>> In the documentation, it's written "Wait for cleanup to complete on
>>>> one node before doing the next".
>>>>
>>>> I would like to know why we can't perform a lot of cleanups at the same
>>>> time?
>>>>
>>>> Thanks
>
> --
>
> - John
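
For anyone who wants to script the "one node at a time" advice from this thread, here is a minimal sketch in Python. It is only an illustration, not anything the posters above provided. It assumes `nodetool` is on the PATH, that `nodetool -h <host> cleanup` returns only once cleanup has finished on that node, and that the host names you pass in are your own (the ones shown in the comment are hypothetical). The helper names `cleanup_command` and `run_cleanup_sequentially` are made up for this example.

```python
import subprocess
from typing import Callable, Iterable, List


def cleanup_command(host: str) -> List[str]:
    """Build the nodetool invocation that runs cleanup against one host."""
    return ["nodetool", "-h", host, "cleanup"]


def run_cleanup_sequentially(
    hosts: Iterable[str],
    runner: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Run cleanup on each host in turn and return the hosts that completed.

    The next host is started only after the previous call has returned.
    With the default runner, check=True raises CalledProcessError on a
    non-zero exit code, so a failure stops the loop instead of moving on.
    """
    completed: List[str] = []
    for host in hosts:
        runner(cleanup_command(host), check=True)
        completed.append(host)
    return completed


# Example call (hypothetical host names, not taken from the thread):
# run_cleanup_sequentially(["cass-node-1", "cass-node-2", "cass-node-3"])
```

The `runner` parameter exists only so the loop can be exercised without a real cluster. Running the nodes one after another keeps the disk IO load on a single node at a time, which is the staggering suggested above.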