Cassandra is streaming it at a near-constant rate (if you had metrics for the network interface, you'd probably see that), but the data doesn't register in nodetool status until the node has received all of the sstables for a column family. At that point, the -tmp-Data.db files get renamed to drop the -tmp, and they become live on the node.
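If you want to check this on disk, here is a rough sketch that adds up the Data.db files still carrying the -tmp marker separately from the live ones. It assumes the marker appears as "-tmp-" in the file name, which is how I read the naming described above and may differ between Cassandra versions. The function name sstable_bytes and the data directory path are made up for illustration, so adjust them to your install.

```python
import os

def sstable_bytes(data_dir):
    """Return (live_bytes, tmp_bytes) for *-Data.db files under data_dir.

    Assumption: in-flight sstables carry "-tmp-" in their file name and
    lose it once they become live. Check this against your version.
    """
    live = tmp = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if not name.endswith("-Data.db"):
                continue  # skip index, filter, and other component files
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file was renamed or removed during the scan
            if "-tmp-" in name:
                tmp += size
            else:
                live += size
    return live, tmp

if __name__ == "__main__":
    # Hypothetical path; use the data_file_directories value from your config.
    live, tmp = sstable_bytes("/var/lib/cassandra/data")
    print("live: %.1f MB, tmp: %.1f MB" % (live / 1e6, tmp / 1e6))
```

If the tmp total climbs steadily while the live total stays flat, that would match a constant-rate stream that only shows up in nodetool status once a column family finishes.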
I suspect you have a table/CF that's approximately 47-48 GB, and it completed, and its size in nodetool status jumped at that time.

From: Felipe Esteves
Reply-To: "user@cassandra.apache.org"
Date: Friday, February 26, 2016 at 11:48 AM
To: "user@cassandra.apache.org"
Subject: Nodetool Rebuild sending few big packets of data. Is it normal?

Hi,

I'm running a nodetool rebuild to include a new DC in my cluster. My config is:

DC1, 2 nodes per rack (2 racks), 70 GB each node
DC2, 2 nodes per rack (1 rack), 90 GB each node
DC3, 2 nodes per rack (1 rack) (THIS IS THE NEW DC)

What I did was get the 2 nodes in DC3 up and running with bootstrap=false, and then ran a rebuild using DC2 as a parameter.

However, when I started, the load on both new nodes rapidly increased to 1.4 GB, according to nodetool status. It then increased slowly for 4 hours, roughly 10 MB at a time. Then, suddenly, one node had 49.5 GB and the other followed soon after. In the instance logs, I have only stream messages from when I started the rebuild.

My question is: is it normal for Cassandra to accumulate this amount of data and then send it? I was hoping it would be more of a gradual and incremental process.

thanks,

Felipe Esteves
Tecnologia
felipe.este...@b2wdigital.com
Tel.: (21) 3504-7162 ramal 57162