Re: lots of extra bytes on disk

2013-03-28 Thread Wei Zhu
Hi Ben, If affordable, just blow away the node and bootstrap in a replacement, or restore from snapshot and repair. -Wei - Original Message - From: "Dean Hiller" To: user@cassandra.apache.org Sent: Thursday, March 28, 2013 11:40:21 AM Subject: Re: lots of extra bytes on di

Re: lots of extra bytes on disk

2013-03-28 Thread Hiller, Dean
Oh, and since our LCS was 10MB per file, it was easy to tell which files had not converted yet. Also, we ended up blowing away a CF on node 5 (of 6) and running a full repair on that CF, and afterward it was at a normal size again as well. Dean On 3/28/13 12:35 PM, "Hiller, Dean" wrote: >We had a runawa

Re: lots of extra bytes on disk

2013-03-28 Thread Hiller, Dean
We had a runaway STCS like this due to our own mistakes but were not sure how to clean it up. We went to LCS instead of STCS and that seemed to bring it way back down since the STCS had repeats and such between SSTables which LCS avoids mostly. I can't help much more than that info though. Dean

Re: lots of extra bytes on disk

2013-03-28 Thread Ben Chobot
Sorry to make it confusing. I didn't have snapshots on some nodes; I just made a snapshot on a node with this problem. So to be clear, on this one example node: Cassandra reports ~250GB of space used. In a CF data directory (before snapshots existed), du -sh showed ~550GB. After the snapshot

Re: lots of extra bytes on disk

2013-03-28 Thread Hiller, Dean
I am confused. I thought you said you don't have a snapshot. Df/du reports space used by existing data AND the snapshot. Cassandra only reports on space used by actual data. If you move the snapshots, does df/du match what Cassandra says? Dean On 3/28/13 12:05 PM, "Ben Chobot" wrote: >
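
Dean's question (does du match Cassandra once snapshots are set aside?) can be checked without moving anything. The following is only a rough sketch: the function name `dir_bytes` is made up for illustration, and it assumes snapshots live in a subdirectory named `snapshots` under the CF data directory. It sums apparent file sizes rather than allocated disk blocks, so the result will differ slightly from du, and it counts each inode once because snapshot files are hard links to SSTables.

```python
import os

def dir_bytes(root, skip_dirs=("snapshots",)):
    """Sum apparent file sizes under root.

    Hypothetical helper: skips any subdirectory whose name is in
    skip_dirs and counts each (device, inode) pair only once, so
    hard-linked files are not double counted.
    """
    seen = set()
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_size
    return total
```

Calling `dir_bytes(path)` and `dir_bytes(path, skip_dirs=())` on the same CF directory gives the size without and with snapshots; the first number is the one to compare against what Cassandra reports.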

Re: lots of extra bytes on disk

2013-03-28 Thread Ben Chobot
...though interestingly, the snapshots of these CFs have the "right" amount of data in them (i.e. it agrees with the live SSTable size reported by Cassandra). Is it total insanity to remove the files from the data directory not included in the snapshot, so long as they were created before the s
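
Before deciding anything, it may help to simply list which files Ben is talking about. This is a read-only sketch under stated assumptions: `files_not_in_snapshot` and both directory arguments are hypothetical, and it compares by file name only, on the assumption that a snapshot holds hard links carrying the same names as the live SSTable files. It deletes nothing and does not answer whether removal is safe.

```python
import os

def files_not_in_snapshot(data_dir, snapshot_dir):
    """List (name, size) for regular files in data_dir that have no
    same-named file in snapshot_dir. Read-only: nothing is removed.
    """
    in_snapshot = set(os.listdir(snapshot_dir))
    extra = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        # Ignore subdirectories such as the snapshots directory itself.
        if not os.path.isfile(path):
            continue
        if name not in in_snapshot:
            extra.append((name, os.path.getsize(path)))
    return extra
```

Summing the sizes in the returned list shows whether these files account for the gap between du and Cassandra's figure.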

Re: lots of extra bytes on disk

2013-03-28 Thread Ben Chobot
Actually, due to a misconfiguration, we weren't snapshotting at all on some of the nodes that are experiencing this problem. So while we've fixed that, snapshots don't explain the problem. On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote: > Have you cleaned up your snapshots? those take extra spa

Re: lots of extra bytes on disk

2013-03-28 Thread Hiller, Dean
Have you cleaned up your snapshots? Those take extra space and don't just go away unless you delete them. Dean On 3/28/13 11:46 AM, "Ben Chobot" wrote: >Are you also running 1.1.5? I'm wondering (ok hoping) that this might be >fixed if I upgrade. > >On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrot

Re: lots of extra bytes on disk

2013-03-28 Thread Ben Chobot
Are you also running 1.1.5? I'm wondering (ok hoping) that this might be fixed if I upgrade. On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote: > We occasionally (twice now on a 40 node cluster over the last 6-8 months) see > this. My best guess is that Cassandra can fail to mark an SSTable for

Re: lots of extra bytes on disk

2013-03-28 Thread Lanny Ripple
We occasionally (twice now on a 40 node cluster over the last 6-8 months) see this. My best guess is that Cassandra can fail to mark an SSTable for cleanup somehow. Forced GCs or reboots don't clear them out. We disable thrift and gossip; drain; snapshot; shut down; clear data/Keyspace/Table/

lots of extra bytes on disk

2013-03-28 Thread Ben Chobot
Some of my cassandra nodes in my 1.1.5 cluster show a large discrepancy between what cassandra says the SSTables should sum up to, and what df and du claim exist. During repairs, this is almost always pretty bad, but post-repair compactions tend to bring those numbers to within a few percent of