You should eventually run cleanup to remove data that the node no longer
needs. However, it does not have to be run right after a join; you can run
it whenever it is convenient. I would run it on a few nodes at a time until
they are all cleaned up.
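
A rough sketch of what "a few nodes at a time" could look like, in case it
helps. It assumes nodetool is on the PATH of the machine running the script
and can reach each node over JMX with "nodetool -h <host> cleanup". The
function names and the batch size of 3 are made up for illustration, not
anything Cassandra prescribes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def batches(hosts, size):
    """Yield consecutive groups of at most `size` hosts."""
    for i in range(0, len(hosts), size):
        yield hosts[i:i + size]


def nodetool_cleanup(host):
    """Run `nodetool cleanup` against one node and return its exit code.

    Assumes nodetool is installed locally and the node's JMX port is reachable.
    """
    return subprocess.call(["nodetool", "-h", host, "cleanup"])


def rolling_cleanup(hosts, batch_size=3, run=nodetool_cleanup):
    """Clean up `batch_size` nodes at a time, one batch after another.

    Nodes within a batch are cleaned concurrently. The next batch starts only
    after every node in the current batch has finished. Stops early if any
    node in a batch reports a non-zero exit code.
    """
    results = {}
    for group in batches(hosts, batch_size):
        with ThreadPoolExecutor(max_workers=len(group)) as pool:
            for host, code in zip(group, pool.map(run, group)):
                results[host] = code
        if any(results[host] != 0 for host in group):
            break  # leave the remaining nodes untouched so the failure can be checked
    return results
```

You would call it with your own list of pre-existing nodes, for example
rolling_cleanup(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]) where the
addresses are placeholders. A smaller batch size keeps the extra compaction
load on the cluster lower, at the cost of a longer overall run.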


On Mon, Jun 10, 2013 at 5:00 PM, Emalayan Vairavanathan <
svemala...@yahoo.com> wrote:

> Hi All,
>
> The DataStax manual suggests that during a Cassandra cluster expansion, an
> administrator has to run nodetool cleanup on each of the previously
> existing Cassandra nodes to remove the keys that no longer belong to
> those nodes. The manual further says that the nodetool cleanup task
> should be run sequentially on the existing Cassandra nodes.
>
> Reference:
> http://www.datastax.com/docs/1.2/operations/add_replace_nodes#adding-capacity
>
> Here is my problem: I have a very large Cassandra cluster with hundreds of
> nodes, and running nodetool cleanup sequentially will take a long time to
> finish.
>
> Questions:
> a) Can someone tell me what the implications are of running nodetool
> cleanup concurrently on the entire cluster?
> b) Will Cassandra automatically take care of removing obsolete keys in
> the future?
>
>
> Thank you
> Emalayan
>
