I'd also have a look at possible running compactions. If you have big column families with STCS (SizeTieredCompactionStrategy), large compactions may be in progress. Check it with nodetool compactionstats.
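For example (a rough sketch only; watch is just a convenient way to poll, and the 10-second interval is arbitrary):

    # Show compactions currently in progress and how much work remains
    nodetool compactionstats

    # Or poll every 10 seconds to see whether the backlog is shrinking
    watch -n 10 nodetool compactionstats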
Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>

On 13 January 2016 at 05:22, Kevin O'Connor <ke...@reddit.com> wrote:

> Have you tried restarting? It's possible there are open file handles to
> sstables that have been compacted away. You can verify by doing lsof and
> grepping for DEL or deleted.
>
> If it's not that, you can run nodetool cleanup on each node to scan all of
> the sstables on disk and remove anything that it's not responsible for.
> Generally this would only work if you added nodes recently.
>
> On Tuesday, January 12, 2016, Rahul Ramesh <rr.ii...@gmail.com> wrote:
>
>> We have a 2-node Cassandra cluster with a replication factor of 2.
>>
>> The load on each node is around 350 GB:
>>
>> Datacenter: Cassandra
>> ==========
>> Address       Rack   Status  State   Load      Owns      Token
>>                                                          -5072018636360415943
>> 172.31.7.91   rack1  Up      Normal  328.5 GB  100.00%   -7068746880841807701
>> 172.31.7.92   rack1  Up      Normal  351.7 GB  100.00%   -5072018636360415943
>>
>> However, if I use df -h:
>>
>> /dev/xvdf   252G   223G   17G   94%   /HDD1
>> /dev/xvdg   493G   456G   12G   98%   /HDD2
>> /dev/xvdh   197G   167G   21G   90%   /HDD3
>>
>> HDD1, 2 and 3 contain only Cassandra data. It amounts to close to 1 TB on
>> one of the machines and close to 650 GB on the other.
>>
>> I started repair 2 days ago. After running repair, the amount of disk
>> space consumed has actually increased.
>> I also checked whether this is because of snapshots. nodetool listsnapshots
>> intermittently lists a snapshot, but it goes away after some time.
>>
>> Can somebody please help me understand:
>> 1. Why is so much disk space consumed?
>> 2. Why did it increase after repair?
>> 3. Is there any way to recover from this state?
>>
>> Thanks,
>> Rahul
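For reference, a rough consolidation of the suggestions in the thread above into commands (a sketch only; adjust the grep pattern to your actual data mounts, /HDD1-3 in your case, and note that clearsnapshot without a snapshot name removes all snapshots on the node):

    # Check for sstables that were compacted away but are still held open
    lsof | grep -i deleted | grep HDD

    # List snapshots and, if they are no longer needed, remove them
    nodetool listsnapshots
    nodetool clearsnapshot

    # Scan sstables and drop data this node no longer owns
    # (mainly useful if nodes were added recently)
    nodetool cleanup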