Thank you, Edward.

I suspect that nodetool cleanup is IO intensive, so running nodetool cleanup 
concurrently on the entire cluster may significantly degrade the IO 
performance of applications.

Apart from this, do you see any other implications of running nodetool 
cleanup concurrently on the entire cluster?

Thank you
Emalayan


________________________________
From: Edward Capriolo <edlinuxg...@gmail.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Emalayan 
Vairavanathan <svemala...@yahoo.com> 
Sent: Monday, 10 June 2013 2:53 PM
Subject: Re: [Cassandra] Expanding a Cassandra cluster
 


You should eventually run cleanup to remove data that is no longer needed on 
the node. However, it does not need to be run immediately after a join. You can 
run it when you get around to it. I would run it on a few nodes at a time until 
they are all cleaned up.
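
To illustrate the "few nodes at a time" approach, here is a minimal sketch 
rather than a tested operational tool. It assumes nodetool is on the PATH of 
the machine running the script and that each node's JMX port is reachable with 
"nodetool -h <host>". The function names and the batch size are hypothetical 
choices made for this example; pick a batch size your cluster's IO headroom can 
tolerate.

```python
import subprocess
from typing import Callable, List, Sequence


def batches(hosts: Sequence[str], size: int) -> List[List[str]]:
    """Split the host list into consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(hosts[i:i + size]) for i in range(0, len(hosts), size)]


def cleanup_command(host: str) -> List[str]:
    """Build the nodetool invocation that runs cleanup on one node."""
    return ["nodetool", "-h", host, "cleanup"]


def run_cleanup_in_batches(
    hosts: Sequence[str],
    size: int,
    start: Callable = subprocess.Popen,
) -> List[str]:
    """Run cleanup on `size` nodes concurrently, one batch after another.

    Returns the hosts whose cleanup exited with a non-zero status.
    `start` is injectable so the batching logic can be exercised
    without a real cluster.
    """
    failed = []
    for group in batches(hosts, size):
        # Start cleanup on every node in this batch at the same time.
        procs = [(host, start(cleanup_command(host))) for host in group]
        # Wait for the whole batch to finish before starting the next one.
        for host, proc in procs:
            if proc.wait() != 0:
                failed.append(host)
    return failed
```

For example, run_cleanup_in_batches(existing_nodes, 3) would clean three nodes 
at a time, where existing_nodes is your own list of node addresses. Any host it 
returns can be retried later.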




On Mon, Jun 10, 2013 at 5:00 PM, Emalayan Vairavanathan <svemala...@yahoo.com> 
wrote:

Hi All,
>
>
>The DataStax manual suggests that during a Cassandra cluster expansion, an 
>administrator has to run nodetool cleanup on each of the previously existing 
>Cassandra nodes to remove the keys that no longer belong to those nodes. 
>Further, the manual says that the nodetool cleanup task should be run 
>sequentially on the existing Cassandra nodes.
>
>
>Reference: 
>http://www.datastax.com/docs/1.2/operations/add_replace_nodes#adding-capacity
>
>
>Here is my problem: I have a very large Cassandra cluster with hundreds of 
>nodes, and running nodetool cleanup sequentially will take a long time to 
>finish.
>
>
>Questions:
>a) Can someone tell me about the implications of running nodetool cleanup 
>concurrently on the entire cluster?
>b) Will Cassandra automatically take care of removing obsolete keys in the 
>future?
>
>
>
>
>Thank you
>Emalayan
