Cassandra is consuming a lot of disk space

Rahul Ramesh Tue, 12 Jan 2016 21:18:57 -0800

We have a 2-node Cassandra cluster with a replication factor of 2.

The load reported for each node is around 350 GB:


Datacenter: Cassandra
==========
Address      Rack        Status State   Load            Owns      Token
                                                                  -5072018636360415943
172.31.7.91  rack1       Up     Normal  328.5 GB        100.00%   -7068746880841807701
172.31.7.92  rack1       Up     Normal  351.7 GB        100.00%   -5072018636360415943

However, df -h shows much higher usage:

/dev/xvdf       252G  223G   17G  94% /HDD1
/dev/xvdg       493G  456G   12G  98% /HDD2
/dev/xvdh       197G  167G   21G  90% /HDD3


HDD1, HDD2, and HDD3 contain only Cassandra data. It amounts to close to 1 TB
on one of the machines and close to 650 GB on the other.
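To make the gap concrete, here is a rough sketch that sums the "Used" column of the df -h lines quoted above and compares the total with the two Load values from the ring output. The numbers are copied from this post. Which node the df output belongs to is not stated, so the sketch compares against both; the helper name used_gb is just for illustration.

```python
# df -h lines copied from this post (all sizes are reported in G).
DF_LINES = """\
/dev/xvdf       252G  223G   17G  94% /HDD1
/dev/xvdg       493G  456G   12G  98% /HDD2
/dev/xvdh       197G  167G   21G  90% /HDD3
"""

# Load values from the ring output above, in GB.
NODETOOL_LOAD_GB = {"172.31.7.91": 328.5, "172.31.7.92": 351.7}


def used_gb(df_output: str) -> int:
    """Sum the 'Used' column, assuming every size ends in 'G'."""
    total = 0
    for line in df_output.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        total += int(fields[2].rstrip("G"))
    return total


total_used = used_gb(DF_LINES)
print(f"df used on HDD1-3: {total_used} GB")
for address, load in NODETOOL_LOAD_GB.items():
    # Assumption: the df output could be from either node.
    print(f"{address}: load {load} GB, gap {total_used - load:.1f} GB")
```

Either way, several hundred GB on disk are not accounted for by the reported load.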

I started a repair 2 days ago. After running the repair, disk space
consumption actually increased.
I also checked whether snapshots are the cause. nodetool listsnapshots
intermittently lists a snapshot, but it goes away after some time.
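As a cross-check on the snapshot theory, a small sketch like the one below could total the files under any directory named "snapshots" on the data disks. It assumes the usual layout where snapshots live in a snapshots/ subdirectory of each table directory, and that /HDD1, /HDD2, and /HDD3 are the data roots. The function name snapshot_bytes is hypothetical. Snapshot files are hard links to SSTables, so this total is an upper bound on what clearing snapshots would free, not an exact figure.

```python
import os


def snapshot_bytes(data_root: str) -> int:
    """Total size of regular files under any 'snapshots' directory."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(data_root):
        # Only count files whose path passes through a 'snapshots' directory.
        if "snapshots" not in dirpath.split(os.sep):
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Assumed data roots, taken from the df output in this post.
    for root in ("/HDD1", "/HDD2", "/HDD3"):
        print(f"{root}: {snapshot_bytes(root) / 1e9:.1f} GB under snapshots/")
```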

Can somebody please help me understand:
1. Why is so much disk space consumed?
2. Why did it increase after repair?
3. Is there any way to recover from this state?


Thanks,
Rahul
