On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Pretty sure there's logic in there that says "don't bother compacting
> a single sstable."

No. You can do it.
Based on the log I have a feeling that it triggers an infinite compaction loop.

> On Wed, Jan 5, 2011 at 2:26 PM, shimi <shim...@gmail.com> wrote:
> > How is minor compaction triggered? Is it triggered only when a new
> > SStable is added?
> >
> > I was wondering if triggering a compaction with minimumCompactionThreshold
> > set to 1 would be useful. If this can happen I assume it will do compaction
> > on files with similar size and remove deleted rows on the rest.
> >
> > Shimi
> >
> > On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller <peter.schul...@infidyne.com>
> > wrote:
> >>
> >> > I don't have a problem with disk space. I have a problem with the data
> >> > size.
> >>
> >> [snip]
> >>
> >> > Bottom line is that I want to reduce the number of requests that go to
> >> > disk. Since there is enough data that is no longer valid I can do it by
> >> > reclaiming the space. The only way to do it is by running major
> >> > compaction.
> >> > I can wait and let Cassandra do it for me but then the data size will
> >> > get even bigger and the response time will be worse. I can do it
> >> > manually but I prefer it to happen in the background with less impact
> >> > on the system.
> >>
> >> Ok - that makes perfect sense then. Sorry for misunderstanding :)
> >>
> >> So essentially, for workloads that are teetering on the edge of cache
> >> warmness and are subject to significant overwrites or removals, it may
> >> be beneficial to perform much more aggressive background compaction
> >> even though it might waste lots of CPU, to keep the in-memory working
> >> set down.
> >>
> >> There was talk (I think in the compaction redesign ticket) about
> >> potentially improving the use of bloom filters such that obsolete data
> >> in sstables could be eliminated from the read set without
> >> necessitating actual compaction; that might help address cases like
> >> these too.
> >>
> >> I don't think there's a pre-existing silver bullet in a current
> >> release; you probably have to live with the need for
> >> greater-than-theoretically-optimal memory requirements to keep the
> >> working set in memory.
> >>
> >> --
> >> / Peter Schuller
> >
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
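
For anyone following the minimumCompactionThreshold question in this thread, here is a rough toy model of "compact files with similar size once enough of them pile up". It is not Cassandra's code: the function names, the 0.5x/1.5x similarity window, and the default threshold of 4 are all assumptions made for illustration. It only shows why a threshold of 1 makes every sstable, including a lone one, a compaction candidate on every check, which would be consistent with (but does not confirm) the loop suspected above.

```python
from typing import List


def bucket_by_size(sizes: List[int], low: float = 0.5, high: float = 1.5) -> List[List[int]]:
    """Group sstable sizes into buckets of roughly similar size.

    A size joins the first bucket whose current average it falls within
    [low * avg, high * avg]; otherwise it starts a new bucket.
    The window bounds are illustrative assumptions.
    """
    buckets: List[List[int]] = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket was close enough in size.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes: List[int], min_threshold: int = 4) -> List[List[int]]:
    """Return the buckets that hold at least min_threshold sstables."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    sstable_sizes = [10, 11, 12, 9, 100, 500]  # hypothetical sizes in MB
    print(compaction_candidates(sstable_sizes, min_threshold=4))
    print(compaction_candidates(sstable_sizes, min_threshold=1))
```

With a threshold of 4 only the four small, similar-sized files qualify. With a threshold of 1 the lone 100 MB and 500 MB files qualify too, and in this model the output of compacting one of them would qualify again on the next check.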