Sorry for the confusion. I didn't have snapshots on some nodes; I just made a snapshot on a node that has this problem.
So to be clear, on this one example node:

- Cassandra reports ~250GB of space used
- In the CF data directory (before snapshots existed), du -sh showed ~550GB
- After the snapshot, du in the same directory still showed ~550GB (they're
  hard links, so that's correct)
- du in the snapshot directory for that CF shows ~250GB, and ls shows ~50
  fewer files

(Rough command sketches of the comparison, and of Lanny's restore procedure,
are at the bottom of this mail, below the quotes.)

On Mar 28, 2013, at 11:10 AM, Hiller, Dean wrote:

> I am confused. I thought you said you don't have a snapshot. df/du
> reports space used by existing data AND the snapshot. Cassandra only
> reports on space used by actual data... if you move the snapshots, does
> df/du match what cassandra says?
>
> Dean
>
> On 3/28/13 12:05 PM, "Ben Chobot" <be...@instructure.com> wrote:
>
>> ...though interestingly, the snapshots of these CFs have the "right"
>> amount of data in them (i.e. they agree with the live SSTable size
>> reported by cassandra). Is it total insanity to remove the files in the
>> data directory that aren't included in the snapshot, as long as they
>> were created before the snapshot?
>>
>> On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>>
>>> Have you cleaned up your snapshots? Those take extra space and don't
>>> just go away unless you delete them.
>>>
>>> Dean
>>>
>>> On 3/28/13 11:46 AM, "Ben Chobot" <be...@instructure.com> wrote:
>>>
>>>> Are you also running 1.1.5? I'm wondering (ok, hoping) that this
>>>> might be fixed if I upgrade.
>>>>
>>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>>>
>>>>> We occasionally (twice now on a 40-node cluster over the last 6-8
>>>>> months) see this. My best guess is that Cassandra can somehow fail
>>>>> to mark an SSTable for cleanup. Forced GCs or reboots don't clear
>>>>> them out. We disable thrift and gossip; drain; snapshot; shut down;
>>>>> clear data/Keyspace/Table/*.db; restore from the just-created
>>>>> snapshot (hard-linking back into place to avoid data transfer); and
>>>>> restart.
>>>>>
>>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <be...@instructure.com>
>>>>> wrote:
>>>>>
>>>>>> Some of my cassandra nodes in my 1.1.5 cluster show a large
>>>>>> discrepancy between what cassandra says the SSTables should sum up
>>>>>> to, and what df and du claim exist. During repairs, this is almost
>>>>>> always pretty bad, but post-repair compactions tend to bring those
>>>>>> numbers to within a few percent of each other... usually. Sometimes
>>>>>> they remain much further apart after compactions have finished. For
>>>>>> instance, I'm looking at one node now that claims to have 205GB of
>>>>>> SSTables, but actually has 450GB of files living in that CF's data
>>>>>> directory. There are no pending compactions, and the most recent
>>>>>> compaction for this CF finished just a few hours ago.
>>>>>>
>>>>>> nodetool cleanup has no effect.
>>>>>>
>>>>>> What could be causing these extra bytes, and how do I get them to
>>>>>> go away? I'm OK with a few extra GB of unexplained data, but an
>>>>>> extra 245GB (more than all the data this node is supposed to have!)
>>>>>> is a little extreme.
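---

For anyone who wants to reproduce the comparison above, this is roughly
what I'm looking at. A minimal sketch only: MyKeyspace, MyCF, the snapshot
tag "mysnap", and the /var/lib/cassandra path are placeholders for whatever
your layout actually uses.

    # What Cassandra thinks the live SSTables for this CF add up to
    # ("Space used (live)" vs "Space used (total)" in cfstats):
    nodetool -h localhost cfstats | grep -A 20 'Column Family: MyCF' \
        | grep 'Space used'

    # What the filesystem actually holds in the CF's data directory
    # (du counts each hard-linked inode once per invocation):
    du -sh /var/lib/cassandra/data/MyKeyspace/MyCF

    # What the snapshot (hard links to the live SSTables) adds up to:
    du -sh /var/lib/cassandra/data/MyKeyspace/MyCF/snapshots/mysnap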
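And to list the files the snapshot left out (the removal candidates I was
asking Dean about), assuming bash and GNU userland:

    # Files present in the live data dir but absent from the snapshot.
    # Since the snapshot hard-links every SSTable Cassandra still tracks,
    # anything that shows up only on the left side is (presumably) the
    # untracked leftover data.
    cd /var/lib/cassandra/data/MyKeyspace/MyCF
    comm -23 <(ls -- *.db | sort) \
             <(cd snapshots/mysnap && ls -- *.db | sort)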
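For the archives, my reading of Lanny's clear-and-restore procedure is
roughly the sketch below. Untested on my side; the service command, the
snapshot tag, and the paths are assumptions about a stock install, and
you'd want to be very sure of the paths before running anything with rm
in it, one node at a time.

    # Quiesce the node and flush everything to disk.
    nodetool -h localhost disablethrift
    nodetool -h localhost disablegossip
    nodetool -h localhost drain

    # Snapshot the live SSTables (hard links, so no extra space yet).
    # If snapshot complains after drain, take it before the drain instead.
    nodetool -h localhost snapshot MyKeyspace -t rebuild

    sudo service cassandra stop

    # Clear the CF's data files, then hard-link the snapshot's files back
    # into place -- no data is actually copied.
    cd /var/lib/cassandra/data/MyKeyspace/MyCF
    rm -f ./*.db
    ln snapshots/rebuild/* .

    sudo service cassandra start

The hard-link restore is the whole point: the snapshot and the restored
files share inodes, so the node comes back with only the SSTables
Cassandra was actually tracking, without shuffling hundreds of GB around.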