Oh, and since our LCS was 10MB per file, it was easy to tell which files had not converted yet. Also, we ended up blowing away a CF on node 5 (of 6) and running a full repair on that CF, and after that it was back down to a normal size as well.

Dean
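(As a quick illustration of that size check: with LCS writing ~10MB SSTables, anything much larger has not been rewritten yet. A rough sketch only; the data path, keyspace, and CF names below are hypothetical:)

    # SSTables well above the 10MB LCS target are likely leftovers
    # from the old STCS layout; path and names are hypothetical
    find /var/lib/cassandra/data/MyKeyspace/MyCF -name '*-Data.db' \
        -size +15M -exec ls -lh {} +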
On 3/28/13 12:35 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

>We had a runaway STCS like this due to our own mistakes but were not sure
>how to clean it up. We went to LCS instead of STCS and that seemed to
>bring it way back down, since STCS leaves repeats and such between
>SSTables, which LCS mostly avoids. I can't help much more than that info,
>though.
>
>Dean
>
>On 3/28/13 12:31 PM, "Ben Chobot" <be...@instructure.com> wrote:
>
>>Sorry to make it confusing. I didn't have snapshots on some nodes; I just
>>made a snapshot on a node with this problem.
>>
>>So to be clear, on this one example node:
>>  Cassandra reports ~250GB of space used.
>>  In a CF data directory (before snapshots existed), du -sh showed ~550GB.
>>  After the snapshot, du in the same directory still showed ~550GB
>>  (they're hard links, so that's correct).
>>  du in the snapshot directory for that CF shows ~250GB, and ls shows
>>  ~50 fewer files.
>>
>>On Mar 28, 2013, at 11:10 AM, Hiller, Dean wrote:
>>
>>> I am confused. I thought you said you don't have a snapshot. Df/du
>>> reports space used by existing data AND the snapshot. Cassandra only
>>> reports on space used by actual data... if you move the snapshots,
>>> does df/du match what cassandra says?
>>>
>>> Dean
>>>
>>> On 3/28/13 12:05 PM, "Ben Chobot" <be...@instructure.com> wrote:
>>>
>>>> ...though interestingly, the snapshot of these CFs has the "right"
>>>> amount of data in it (i.e. it agrees with the live SSTable size
>>>> reported by cassandra). Is it total insanity to remove the files
>>>> from the data directory not included in the snapshot, so long as
>>>> they were created before the snapshot?
>>>>
>>>> On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>>>>
>>>>> Have you cleaned up your snapshots? Those take extra space and
>>>>> don't just go away unless you delete them.
>>>>>
>>>>> Dean
>>>>>
>>>>> On 3/28/13 11:46 AM, "Ben Chobot" <be...@instructure.com> wrote:
>>>>>
>>>>>> Are you also running 1.1.5? I'm wondering (ok, hoping) that this
>>>>>> might be fixed if I upgrade.
>>>>>>
>>>>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>>>>>
>>>>>>> We occasionally (twice now on a 40-node cluster over the last
>>>>>>> 6-8 months) see this. My best guess is that Cassandra can
>>>>>>> somehow fail to mark an SSTable for cleanup. Forced GCs or
>>>>>>> reboots don't clear them out. We disable thrift and gossip;
>>>>>>> drain; snapshot; shut down; clear data/Keyspace/Table/*.db and
>>>>>>> restore (hard-linking back into place to avoid data transfer)
>>>>>>> from the just-created snapshot; restart.
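(A sketch of the rebuild sequence Lanny describes, for reference only. The nodetool subcommands are standard; the keyspace, CF, snapshot tag, paths, and service command are hypothetical, and the layout assumes 1.1's <cf_dir>/snapshots/<tag> snapshot location:)

    # stop accepting writes and flush memtables to disk
    nodetool disablethrift
    nodetool disablegossip
    nodetool drain
    # the snapshot hard-links only the SSTables cassandra considers live
    nodetool snapshot -t rescue MyKeyspace
    sudo service cassandra stop
    cd /var/lib/cassandra/data/MyKeyspace/MyCF
    rm -f ./*.db                # clear the live SSTable components
    ln snapshots/rescue/* .     # hard-link the snapshot back, no copying
    sudo service cassandra start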
>>>>>>>
>>>>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <be...@instructure.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Some of my cassandra nodes in my 1.1.5 cluster show a large
>>>>>>>> discrepancy between what cassandra says the SSTables should sum
>>>>>>>> up to, and what df and du claim exist. During repairs this is
>>>>>>>> almost always pretty bad, but post-repair compactions tend to
>>>>>>>> bring those numbers to within a few percent of each other...
>>>>>>>> usually. Sometimes they remain much further apart after
>>>>>>>> compactions have finished - for instance, I'm looking at one
>>>>>>>> node now that claims to have 205GB of SSTables, but actually has
>>>>>>>> 450GB of files living in that CF's data directory. No pending
>>>>>>>> compactions, and the most recent compaction for this CF finished
>>>>>>>> just a few hours ago.
>>>>>>>>
>>>>>>>> nodetool cleanup has no effect.
>>>>>>>>
>>>>>>>> What could be causing these extra bytes, and how can I get them
>>>>>>>> to go away? I'm ok with a few extra GB of unexplained data, but
>>>>>>>> an extra 245GB (more than all the data this node is supposed to
>>>>>>>> have!) is a little extreme.
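(For completeness, a sketch of how one might line up what cassandra reports against what is on disk, and list the files a fresh snapshot did not hard-link, which is the comparison Ben describes. Paths, CF name, and the snapshot tag are hypothetical:)

    cd /var/lib/cassandra/data/MyKeyspace/MyCF
    # "Space used (live)" from cfstats vs. actual bytes on disk
    nodetool cfstats | grep -A 20 'Column Family: MyCF' \
        | grep 'Space used (live)'
    du -sh .
    # *.db files in the data dir that the snapshot did not hard-link
    # are the suspect leftovers (comm needs sorted input)
    comm -23 <(ls *.db | sort) <(ls snapshots/mytag | sort)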