Re: Reclaim deleted rows space

2011-01-12 Thread David Boxenhorn
I think that if SSTs are partitioned within the node using RP, so that each partition is small and can be compacted independently of all other partitions, you can implement an algorithm that will spread out the work of compaction over time so that it never takes a node out of commission, as it does

Re: Reclaim deleted rows space

2011-01-10 Thread Jonathan Ellis
I'd suggest describing your approach on https://issues.apache.org/jira/browse/CASSANDRA-1608, and if it's attractive, porting it to 0.8. It's too late for us to make deep changes in 0.6 and probably even 0.7 for the sake of stability. On Mon, Jan 10, 2011 at 8:00 AM, shimi wrote: > I modified th

Re: Reclaim deleted rows space

2011-01-10 Thread shimi
I modified the code to limit the size of the SSTables. I would be glad if someone could take a look at it: https://github.com/Shimi/cassandra/tree/cassandra-0.6 Shimi On Fri, Jan 7, 2011 at 2:04 AM, Jonathan Shook wrote: > I believe the follow

Re: Reclaim deleted rows space

2011-01-06 Thread Jonathan Shook
I believe the following condition within submitMinorIfNeeded(...) determines whether to continue, so it's not a hard loop. // if (sstables.size() >= minThreshold) ... On Thu, Jan 6, 2011 at 2:51 AM, shimi wrote: > According to the code it makes sense. > submitMinorIfNeeded() calls doCompaction(

Re: Reclaim deleted rows space

2011-01-06 Thread shimi
According to the code it makes sense. submitMinorIfNeeded() calls doCompaction(), which calls submitMinorIfNeeded(). With minimumCompactionThreshold = 1, submitMinorIfNeeded() will always run compaction. Shimi On Thu, Jan 6, 2011 at 10:26 AM, shimi wrote: > > > On Wed, Jan 5, 2011 at 11:31 PM, Jon
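
The call cycle described in this message and the threshold guard quoted in the reply can be put into a toy model. The sketch below is not Cassandra's code: the class name, the method name, and the assumption that one compaction merges every candidate into a single file are all hypothetical, and the round cap exists only so the example terminates. Under those assumptions, a threshold of 2 stops after one merge, while a threshold of 1 keeps satisfying the guard.

```java
// Toy model of the submitMinorIfNeeded() -> doCompaction() cycle discussed in
// this thread. Hypothetical names throughout; this is not Cassandra's code.
final class CompactionLoopSketch {

    /**
     * Counts how many compactions run before the guard
     * "sstables.size() >= minThreshold" stops the cycle.
     * maxRounds is a safety cap so the model itself cannot spin forever.
     */
    static int countCompactions(int sstableCount, int minThreshold, int maxRounds) {
        int sstables = sstableCount;
        int rounds = 0;
        // Guard quoted in the thread: if (sstables.size() >= minThreshold)
        while (sstables >= minThreshold && rounds < maxRounds) {
            // Assumption: one compaction merges all candidates into one file.
            sstables = 1;
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        // Threshold 2: the guard fails once a single file is left.
        System.out.println("threshold 2: " + countCompactions(4, 2, 100));
        // Threshold 1: a single file still passes the guard, so only the cap stops it.
        System.out.println("threshold 1: " + countCompactions(4, 1, 100));
    }
}
```

Whether the real code has an additional check that breaks this cycle is exactly what the thread is debating, so the model only shows what the quoted guard alone would do.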

Re: Reclaim deleted rows space

2011-01-06 Thread shimi
On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis wrote: > Pretty sure there's logic in there that says "don't bother compacting > a single sstable." No. You can do it. Based on the log I have a feeling that it triggers an infinite compaction loop. > On Wed, Jan 5, 2011 at 2:26 PM, shimi wrote

Re: Reclaim deleted rows space

2011-01-06 Thread shimi
Am I missing something here? It is already possible to trigger major compaction on a specific CF. On Thu, Jan 6, 2011 at 4:50 AM, Tyler Hobbs wrote: > Although it's not exactly the ability to list specific SSTables, the > ability to only compact specific CFs will be in upcoming releases: > > htt

Re: Reclaim deleted rows space

2011-01-05 Thread Tyler Hobbs
Although it's not exactly the ability to list specific SSTables, the ability to only compact specific CFs will be in upcoming releases: https://issues.apache.org/jira/browse/CASSANDRA-1812 - Tyler On Wed, Jan 5, 2011 at 7:46 PM, Edward Capriolo wrote: > On Wed, Jan 5, 2011 at 4:31 PM, Jonathan

Re: Reclaim deleted rows space

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis wrote: > Pretty sure there's logic in there that says "don't bother compacting > a single sstable." > > On Wed, Jan 5, 2011 at 2:26 PM, shimi wrote: >> How is minor compaction triggered? Is it triggered only when a new >> SSTable is added? >> >>

Re: Reclaim deleted rows space

2011-01-05 Thread Jonathan Ellis
Pretty sure there's logic in there that says "don't bother compacting a single sstable." On Wed, Jan 5, 2011 at 2:26 PM, shimi wrote: > How is minor compaction triggered? Is it triggered only when a new > SSTable is added? > > I was wondering if triggering a compaction with minimumCompaction

Re: Reclaim deleted rows space

2011-01-05 Thread shimi
How is minor compaction triggered? Is it triggered only when a new SSTable is added? I was wondering if triggering a compaction with minimumCompactionThreshold set to 1 would be useful. If this can happen, I assume it will do compaction on files with similar size and remove deleted rows on the
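
The idea of compacting "files with similar size" can be illustrated with a small bucketing sketch. Everything here is an assumption made for illustration: the class and method names are hypothetical, and the rule that a file joins a bucket when its size is within 50% of the bucket average is a stand-in, not something stated in this thread. The point it shows is that one very large file sits alone in its bucket, so it only becomes a candidate if the threshold is 1 or if files of comparable size appear.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative size bucketing; hypothetical names, not Cassandra's code.
final class SizeBucketSketch {

    /** Groups file sizes so that each bucket holds files of roughly similar size. */
    static List<List<Long>> bucketBySize(List<Long> sizes) {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted);
        List<List<Long>> buckets = new ArrayList<>();
        for (long size : sorted) {
            List<Long> target = null;
            for (List<Long> bucket : buckets) {
                double avg = average(bucket);
                // Assumed similarity rule: within 50% of the bucket's average size.
                if (size > avg / 2 && size < avg * 3 / 2) {
                    target = bucket;
                    break;
                }
            }
            if (target == null) {
                target = new ArrayList<>();
                buckets.add(target);
            }
            target.add(size);
        }
        return buckets;
    }

    /** Keeps only the buckets that hold at least minThreshold files. */
    static List<List<Long>> eligibleBuckets(List<List<Long>> buckets, int minThreshold) {
        List<List<Long>> eligible = new ArrayList<>();
        for (List<Long> bucket : buckets) {
            if (bucket.size() >= minThreshold) {
                eligible.add(bucket);
            }
        }
        return eligible;
    }

    private static double average(List<Long> bucket) {
        long sum = 0;
        for (long value : bucket) {
            sum += value;
        }
        return (double) sum / bucket.size();
    }

    public static void main(String[] args) {
        // Three small files and one very large one (sizes in MB, made up).
        List<List<Long>> buckets = bucketBySize(List.of(100L, 110L, 90L, 100000L));
        System.out.println("buckets: " + buckets);
        System.out.println("eligible at threshold 2: " + eligibleBuckets(buckets, 2));
        System.out.println("eligible at threshold 1: " + eligibleBuckets(buckets, 1));
    }
}
```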

Re: Reclaim deleted rows space

2011-01-04 Thread Peter Schuller
> I don't have a problem with disk space. I have a problem with the data > size. [snip] > Bottom line is that I want to reduce the number of requests that go to > disk. Since there is enough data that is no longer valid I can do it by > reclaiming the space. The only way to do it is by running

Re: Reclaim deleted rows space

2011-01-04 Thread shimi
Yes, I am aware of that. This is the reason I upgraded to 0.6.8. Still, all the deleted rows in the biggest SSTable will be removed in a major compaction Shimi On Tue, Jan 4, 2011 at 6:40 PM, Robert Coli wrote: > On Tue, Jan 4, 2011 at 4:33 AM, Peter Schuller > wrote: > > For some cases this will

Re: Reclaim deleted rows space

2011-01-04 Thread Robert Coli
On Tue, Jan 4, 2011 at 4:33 AM, Peter Schuller wrote: > For some cases this will be beneficial, but not always. It's been > further improved for 0.7 too w.r.t. tombstone handling in non-major > compactions (I don't have the JIRA ticket number handy). https://issues.apache.org/jira/browse/CASSAND

Re: Reclaim deleted rows space

2011-01-04 Thread shimi
I think I didn't make myself clear. I don't have a problem with disk space. I have a problem with the data size. I have a simple CRUD application. Most of the requests are reads, but there are updates/deletes, and as time passes the number of deleted rows becomes big enough to free some disk spa

Re: Reclaim deleted rows space

2011-01-04 Thread Peter Schuller
> This is what I thought. I was wishing there might be another way to reclaim > the space. Be sure you really need this first :) Normally you just let it happen in the background. > The problem is that the more data you have, the more time it will take > Cassandra to respond. Relative to what though?

Re: Reclaim deleted rows space

2011-01-04 Thread shimi
This is what I thought. I was wishing there might be another way to reclaim the space. The problem is that the more data you have, the more time it will take Cassandra to respond. Reclaiming the space of deleted rows in the biggest SSTable requires a major compaction. This compaction can be triggered by

Re: Reclaim deleted rows space

2011-01-03 Thread Peter Schuller
> Major compaction does it, but only if GCGraceSeconds has elapsed. See: > >   http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html But to be clear, under the assumption that your data is a lot smaller than the tombstones, a major compaction will definitely reclaim space even i
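
The GCGraceSeconds condition mentioned here amounts to an age check on each tombstone at compaction time. The sketch below is a minimal illustration with hypothetical names; the strict comparison at the boundary and the 864000-second (10-day) grace period used in the example are assumptions for the sketch, not values taken from this thread.

```java
// Illustrative tombstone age check; hypothetical names, not Cassandra's code.
final class TombstoneGcSketch {

    /**
     * Returns true when a tombstone written at deletedAtSeconds is old enough
     * for a compaction running at nowSeconds to drop it.
     * Assumption: the boundary is strict (age must exceed the grace period).
     */
    static boolean isPurgeable(long deletedAtSeconds, long nowSeconds, long gcGraceSeconds) {
        return nowSeconds - deletedAtSeconds > gcGraceSeconds;
    }

    public static void main(String[] args) {
        long gcGraceSeconds = 864_000L; // assumed 10-day grace period for the example
        long deletedAt = 1_000_000L;    // made-up deletion timestamp, in seconds

        // One day after the delete: the tombstone must still be kept.
        System.out.println(isPurgeable(deletedAt, deletedAt + 86_400L, gcGraceSeconds));
        // Eleven days after the delete: the tombstone may be dropped.
        System.out.println(isPurgeable(deletedAt, deletedAt + 11 * 86_400L, gcGraceSeconds));
    }
}
```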

Re: Reclaim deleted rows space

2011-01-03 Thread Peter Schuller
> Let's assume I have: > * a single 100GB SSTable file > * min compaction threshold set to 2 > If I delete rows that are located in this file, is the only way to "clean" > the deleted rows to insert another 100GB of data or to trigger a > painful major compaction? Major compaction does i

Re: Reclaim deleted rows space

2011-01-02 Thread Adrian Cockcroft

Reclaim deleted rows space

2011-01-02 Thread shimi
Let's assume I have:
* a single 100GB SSTable file
* min compaction threshold set to 2
If I delete rows that are located in this file, is the only way to "clean" the deleted rows to insert another 100GB of data or to trigger a painful major compaction? Shimi