Hi Rahul, just an idea: did you have a look at the data directories on disk (/var/lib/cassandra/data)? There may be directories left over from old keyspaces that were deleted or snapshotted earlier. Try something like "du -sh /var/lib/cassandra/data/*" to see which keyspace is consuming your space.
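If an old snapshot turns out to be the culprit, something along these lines should show it (this assumes the default data directory layout; adjust the path if yours differs):

  du -sh /var/lib/cassandra/data/*
  find /var/lib/cassandra/data -type d -name snapshots -exec du -sh {} +
  nodetool clearsnapshot    # drops all snapshots; pass a keyspace name to limit it

Keep in mind clearsnapshot deletes data you might still want as a backup, so only run it once you are sure nothing references those snapshots. The other checks mentioned further down the thread are summarised at the bottom of this mail.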
Jan

Sent from my iPhone

> On 14.01.2016, at 07:25, Rahul Ramesh <rr.ii...@gmail.com> wrote:
>
> Thanks for your suggestion.
>
> Compaction was happening on one of the large tables. The disk space did not
> decrease much after the compaction, so I ran an external compaction. The disk
> space decreased by around 10%. However, it is still consuming close to 750Gb
> for a load of 250Gb.
>
> I even restarted Cassandra, thinking there may be some open files. However,
> it didn't help much.
>
> Is there any way to find out why so much data is being consumed?
>
> I checked if there are any open files using lsof. There are not any open
> files.
>
> Recovery:
> Just a wild thought:
> I am using a replication factor of 2 and I have two nodes. If I delete the
> complete data on one of the nodes, will I be able to recover all the data
> from the active node?
> I don't want to pursue this path as I want to find out the root cause of the
> issue!
>
> Any help will be greatly appreciated.
>
> Thank you,
>
> Rahul
>
>> On Wed, Jan 13, 2016 at 3:37 PM, Carlos Rolo <r...@pythian.com> wrote:
>> You can check if the snapshot exists in the snapshot folder.
>> Repairs stream sstables over, which can temporarily increase disk space.
>> But I think Carlos Alonso might be correct. Running compactions might be
>> the issue.
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
>> www.pythian.com
>>
>>> On Wed, Jan 13, 2016 at 9:24 AM, Carlos Alonso <i...@mrcalonso.com> wrote:
>>> I'd also have a look at possible running compactions.
>>>
>>> If you have big column families with STCS then large compactions may be
>>> happening.
>>>
>>> Check it with nodetool compactionstats.
>>>
>>> Carlos Alonso | Software Engineer | @calonso
>>>
>>>> On 13 January 2016 at 05:22, Kevin O'Connor <ke...@reddit.com> wrote:
>>>> Have you tried restarting? It's possible there are open file handles to
>>>> sstables that have been compacted away. You can verify by running lsof
>>>> and grepping for DEL or deleted.
>>>>
>>>> If it's not that, you can run nodetool cleanup on each node to scan all
>>>> of the sstables on disk and remove anything that it's not responsible
>>>> for. Generally this would only help if you added nodes recently.
>>>>
>>>>> On Tuesday, January 12, 2016, Rahul Ramesh <rr.ii...@gmail.com> wrote:
>>>>> We have a 2-node Cassandra cluster with a replication factor of 2.
>>>>>
>>>>> The load on each node is around 350Gb:
>>>>>
>>>>> Datacenter: Cassandra
>>>>> ==========
>>>>> Address      Rack   Status  State   Load      Owns     Token
>>>>>                                                        -5072018636360415943
>>>>> 172.31.7.91  rack1  Up      Normal  328.5 GB  100.00%  -7068746880841807701
>>>>> 172.31.7.92  rack1  Up      Normal  351.7 GB  100.00%  -5072018636360415943
>>>>>
>>>>> However, if I use df -h:
>>>>>
>>>>> /dev/xvdf  252G  223G  17G  94%  /HDD1
>>>>> /dev/xvdg  493G  456G  12G  98%  /HDD2
>>>>> /dev/xvdh  197G  167G  21G  90%  /HDD3
>>>>>
>>>>> HDD1, HDD2 and HDD3 contain only Cassandra data. It amounts to close to
>>>>> 1Tb on one of the machines, and on the other machine it is close to 650Gb.
>>>>>
>>>>> I started a repair 2 days ago; after running the repair, the disk space
>>>>> consumption has actually increased.
>>>>> I also checked if this is because of snapshots. nodetool listsnapshots
>>>>> intermittently lists a snapshot, but it goes away after some time.
>>>>>
>>>>> Can somebody please help me understand:
>>>>> 1. Why is so much disk space consumed?
>>>>> 2. Why did it increase after the repair?
>>>>> 3. Is there any way to recover from this state?
>>>>>
>>>>> Thanks,
>>>>> Rahul
>>
>> --
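PS: for quick reference, here are the checks suggested earlier in the thread in one place (run them on each node; the lsof pattern is just one way to spot deleted-but-open files):

  nodetool compactionstats       # are large compactions still running?
  nodetool listsnapshots         # snapshots still holding disk space
  nodetool cleanup               # drops data the node no longer owns; only helps after topology changes
  lsof | grep -E 'DEL|deleted'   # sstables deleted from disk but still held open by the process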