Hi Jan,

I checked it. There are no old keyspaces or tables. Thanks for your pointer, I started
looking inside the directories. I see a lot of snapshot directories inside the table
directories. These directories are consuming space.

However, these snapshots are not shown when I issue listsnapshots:

./bin/nodetool listsnapshots
Snapshot Details:
There are no snapshots

Can I safely delete those snapshots? Why is listsnapshots not showing the snapshots?
Also, in future, how can we find out if there are snapshots?

Thanks,
Rahul
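One way to answer the last two questions is to size the snapshot directories directly on
disk and let nodetool drop them all at once. This is only a sketch: the
/var/lib/cassandra/data path comes from Jan's mail below, so adjust it to the HDD1/2/3
mounts used on this cluster.

    # Show every snapshots directory and how much space it holds
    find /var/lib/cassandra/data -type d -name snapshots -exec du -sh {} +

    # Remove all snapshots on this node; this only deletes the hard-linked
    # snapshot copies, never the live sstables
    nodetool clearsnapshot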
On Thu, Jan 14, 2016 at 12:50 PM, Jan Kesten <j.kes...@enercast.de> wrote:

> Hi Rahul,
>
> just an idea, did you have a look at the data directories on disk
> (/var/lib/cassandra/data)? It could be that there are some from old
> keyspaces that have been deleted and snapshotted before. Try something like
> "du -sh /var/lib/cassandra/data/*" to verify which keyspace is consuming
> your space.
>
> Jan
>
> Sent from my iPhone
>
> On 14.01.2016 at 07:25, Rahul Ramesh <rr.ii...@gmail.com> wrote:
>
> Thanks for your suggestion.
>
> Compaction was happening on one of the large tables. The disk space did
> not decrease much after the compaction. So I ran an external compaction.
> The disk space decreased by around 10%. However, it is still consuming close
> to 750Gb for a load of 250Gb.
>
> I even restarted Cassandra thinking there may be some open files. However,
> it didn't help much.
>
> Is there any way to find out why so much data is being consumed?
>
> I checked if there are any open files using lsof. There are not any open
> files.
>
> *Recovery:*
> Just a wild thought:
> I am using a replication factor of 2 and I have two nodes. If I delete the
> complete data on one of the nodes, will I be able to recover all the data
> from the active node?
> I don't want to pursue this path as I want to find out the root cause of
> the issue!
>
> Any help will be greatly appreciated.
>
> Thank you,
> Rahul
>
> On Wed, Jan 13, 2016 at 3:37 PM, Carlos Rolo <r...@pythian.com> wrote:
>
>> You can check if the snapshot exists in the snapshot folder.
>> Repairs stream sstables over, which can temporarily increase disk space. But
>> I think Carlos Alonso might be correct. Running compactions might be the
>> issue.
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
>> www.pythian.com
>>
>> On Wed, Jan 13, 2016 at 9:24 AM, Carlos Alonso <i...@mrcalonso.com> wrote:
>>
>>> I'd have a look also at possible running compactions.
>>>
>>> If you have big column families with STCS then large compactions may be
>>> happening.
>>>
>>> Check it with nodetool compactionstats.
>>>
>>> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>>>
>>> On 13 January 2016 at 05:22, Kevin O'Connor <ke...@reddit.com> wrote:
>>>
>>>> Have you tried restarting? It's possible there are open file handles to
>>>> sstables that have been compacted away. You can verify by doing lsof and
>>>> grepping for DEL or deleted.
>>>>
>>>> If it's not that, you can run nodetool cleanup on each node to scan all
>>>> of the sstables on disk and remove anything that it's not responsible for.
>>>> Generally this would only work if you added nodes recently.
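Kevin's lsof check can be made concrete along these lines. A sketch only: finding the
JVM via pgrep -f CassandraDaemon is an assumption about how Cassandra was started.

    # List files the Cassandra JVM still holds open even though they have
    # been deleted from disk (compacted-away sstables show up here)
    lsof -p "$(pgrep -f CassandraDaemon)" | grep -Ei 'DEL|deleted'

    # Only helps if nodes were added recently: drops data this node no longer owns
    nodetool cleanup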
>>>> On Tuesday, January 12, 2016, Rahul Ramesh <rr.ii...@gmail.com> wrote:
>>>>
>>>>> We have a 2 node Cassandra cluster with a replication factor of 2.
>>>>>
>>>>> The load factor on the nodes is around 350Gb:
>>>>>
>>>>> Datacenter: Cassandra
>>>>> ==========
>>>>> Address      Rack   Status  State   Load      Owns      Token
>>>>>                                                          -5072018636360415943
>>>>> 172.31.7.91  rack1  Up      Normal  328.5 GB  100.00%   -7068746880841807701
>>>>> 172.31.7.92  rack1  Up      Normal  351.7 GB  100.00%   -5072018636360415943
>>>>>
>>>>> However, if I use df -h:
>>>>>
>>>>> /dev/xvdf  252G  223G  17G  94%  /HDD1
>>>>> /dev/xvdg  493G  456G  12G  98%  /HDD2
>>>>> /dev/xvdh  197G  167G  21G  90%  /HDD3
>>>>>
>>>>> HDD1, HDD2 and HDD3 contain only Cassandra data. It amounts to close to
>>>>> 1Tb on one of the machines, and on the other machine it is close to 650Gb.
>>>>>
>>>>> I started repair 2 days ago. After running repair, the amount of disk
>>>>> space consumed has actually increased.
>>>>> I also checked if this is because of snapshots. nodetool listsnapshots
>>>>> intermittently lists a snapshot but it goes away after some time.
>>>>>
>>>>> Can somebody please help me understand:
>>>>> 1. Why is so much disk space consumed?
>>>>> 2. Why did it increase after repair?
>>>>> 3. Is there any way to recover from this state?
>>>>>
>>>>> Thanks,
>>>>> Rahul
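For the three questions at the end, the replies above boil down to a couple of checks.
A sketch only, assuming nodetool is on the PATH and a 2.x-era nodetool (cfstats was
later renamed tablestats); <keyspace> is a placeholder.

    # Are large STCS compactions still pending or running?
    nodetool compactionstats

    # Per-table accounting: the gap between "Space used (live)" and
    # "Space used (total)" is space held by obsolete sstables that have not
    # been removed yet; newer versions also report snapshot space here.
    nodetool cfstats <keyspace>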