I am confused. I thought you said you don't have a snapshot. df/du reports space used by existing data AND the snapshot; Cassandra only reports space used by actual data... if you move the snapshots, does df/du match what Cassandra says?
Dean

On 3/28/13 12:05 PM, "Ben Chobot" <be...@instructure.com> wrote:

> ...though interestingly, the snapshots of these CFs have the "right"
> amount of data in them (i.e. they agree with the live SSTable size
> reported by Cassandra). Is it total insanity to remove the files from
> the data directory not included in the snapshot, so long as they were
> created before the snapshot?
>
> On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>
>> Have you cleaned up your snapshots... those take extra space and don't
>> just go away unless you delete them.
>>
>> Dean
>>
>> On 3/28/13 11:46 AM, "Ben Chobot" <be...@instructure.com> wrote:
>>
>>> Are you also running 1.1.5? I'm wondering (OK, hoping) that this
>>> might be fixed if I upgrade.
>>>
>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>>
>>>> We occasionally (twice now on a 40-node cluster over the last 6-8
>>>> months) see this. My best guess is that Cassandra can somehow fail
>>>> to mark an SSTable for cleanup. Forced GCs or reboots don't clear
>>>> them out. We disable thrift and gossip; drain; snapshot; shut down;
>>>> clear data/Keyspace/Table/*.db and restore from the just-created
>>>> snapshot (hard-linking back into place to avoid data transfer);
>>>> then restart.
>>>>
>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <be...@instructure.com> wrote:
>>>>
>>>>> Some of the Cassandra nodes in my 1.1.5 cluster show a large
>>>>> discrepancy between what Cassandra says the SSTables should sum to
>>>>> and what df and du claim exists. During repairs this is almost
>>>>> always pretty bad, but post-repair compactions tend to bring those
>>>>> numbers to within a few percent of each other... usually.
>>>>> Sometimes they remain much further apart after compactions have
>>>>> finished. For instance, I'm looking at one node now that claims to
>>>>> have 205GB of SSTables but actually has 450GB of files living in
>>>>> that CF's data directory. No pending compactions, and the most
>>>>> recent compaction for this CF finished just a few hours ago.
>>>>>
>>>>> nodetool cleanup has no effect.
>>>>>
>>>>> What could be causing these extra bytes, and how do I get them to
>>>>> go away? I'm OK with a few extra GB of unexplained data, but an
>>>>> extra 245GB (more than all the data this node is supposed to
>>>>> have!) is a little extreme.
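The snapshot behavior Dean is pointing at comes from Cassandra snapshots being hard links to the live SSTables. Here is a minimal sketch, using a scratch directory and a fake SSTable file (made-up paths, not a real Cassandra data directory), of why a snapshot is "free" at creation time but holds on to disk space once the original files are deleted, which is how du/df can drift away from the live size Cassandra reports:

```shell
#!/bin/sh
# Scratch-space demo (hypothetical paths, not a real Cassandra layout)
# of how snapshots-as-hard-links interact with du.
set -e

work=$(mktemp -d)
data="$work/data"                    # stands in for data/Keyspace/Table
snap="$data/snapshots/demo"
mkdir -p "$data" "$snap"

# Fake a 1 MiB SSTable.
dd if=/dev/zero of="$data/cf-1-Data.db" bs=1024 count=1024 2>/dev/null

# "nodetool snapshot" hard-links live SSTables into the snapshot dir;
# no new blocks are allocated, just a second directory entry.
ln "$data/cf-1-Data.db" "$snap/cf-1-Data.db"
links=$(stat -c %h "$data/cf-1-Data.db")   # GNU stat; link count is now 2

# du counts each inode once, so data + snapshot together is still ~1 MiB.
before_kb=$(du -sk "$work" | cut -f1)

# Simulate compaction deleting the live SSTable: the snapshot's link
# keeps the blocks allocated, so du stops matching the live data size.
rm "$data/cf-1-Data.db"
after_kb=$(du -sk "$snap" | cut -f1)

echo "links=$links before_kb=$before_kb after_kb=$after_kb"
```

This is also why the restore step in Lanny's procedure moves no data: hard-linking the snapshot's files back into the data directory just re-creates directory entries for the same inodes, a metadata-only operation.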