Cassandra is streaming the data at a near-constant rate (if you had metrics for 
the network interface, you'd probably see that), but it doesn't register in 
nodetool status until all of the sstables for a column family have finished 
streaming. At that point, the -tmp-Data.db files get renamed to drop the -tmp, 
and they become live on the node.

I suspect you have a table/CF that's approximately 47-48 GB; it completed, 
and its size in nodetool status jumped at that time.
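
If you want to check this on the new nodes, nodetool netstats reports the 
streams in progress. You can also total up the temporary files on disk while 
the rebuild runs. Below is a rough Python sketch of that second check, not an 
official tool. It assumes the data directory is /var/lib/cassandra/data (the 
usual package default; yours may differ) and that in-progress files carry 
"-tmp-" in their names as described above. The function name 
pending_stream_bytes is made up for this example.

```python
import os

# Assumption: default data directory for package installs; adjust as needed.
DEFAULT_DATA_DIR = "/var/lib/cassandra/data"


def pending_stream_bytes(data_dir, marker="-tmp-"):
    """Sum the sizes of files under data_dir whose names contain marker.

    These are the sstable components still being written, which are not
    yet counted in the load shown by nodetool status.
    """
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if marker in name:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    # The file was renamed or removed between listing and
                    # stat, which is expected when a stream completes.
                    pass
    return total


if __name__ == "__main__":
    gib = pending_stream_bytes(DEFAULT_DATA_DIR) / 1024 ** 3
    print(f"{gib:.2f} GiB in temporary sstable files")
```

Run it every few minutes during the rebuild: if the total grows steadily while 
nodetool status stays flat, the data is arriving and just hasn't been made 
live yet.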



From:  Felipe Esteves
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, February 26, 2016 at 11:48 AM
To:  "user@cassandra.apache.org"
Subject:  Nodetool Rebuild sending few big packets of data. Is it normal?

Hi, 

I'm running a nodetool rebuild to include a new DC in my cluster.
My config is:
DC1, 2 nodes per rack (2 racks), 70gb each node
DC2, 2 nodes per rack (1 rack), 90gb each node
DC3, 2 nodes per rack (1 rack) (THIS IS THE NEW DC)

What I did was get the 2 nodes in DC3 up and running with auto_bootstrap: false, 
and then ran a rebuild using DC2 as a parameter.

However, when I started, the load on both new nodes rapidly increased to 1.4 GB, 
according to nodetool status. It then increased slowly for 4 hours, roughly 
10 MB at a time. Then, suddenly, one node had 49.5 GB, and the other followed 
soon after. In the instance logs, I only have stream messages from when I 
started the rebuild.

My question is: is it normal for Cassandra to accumulate this amount of data and 
then send it? I was hoping it would be a more gradual, incremental process.

thanks,

Felipe Esteves

Tecnologia

felipe.este...@b2wdigital.com

Tel.: (21) 3504-7162 ramal 57162




