Pretty sure there's logic in there that says "don't bother compacting
a single sstable."
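
A rough sketch of what such a guard could look like, for illustration only. This is not Cassandra's actual code: the function name, the 50%-150% similarity rule, and the default thresholds are assumptions made for the example.

```python
def pick_compaction_buckets(sizes, min_threshold=4, max_threshold=32):
    """Group sstable sizes into buckets of similar size and return the
    buckets worth compacting. Hypothetical sketch, not Cassandra's code."""
    buckets = []  # each bucket is a list of sstable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # Assumed similarity rule: within 50%..150% of the bucket average.
            if 0.5 * avg <= size <= 1.5 * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size; start a new one.
            buckets.append([size])
    # Assumed guard: never compact a lone sstable, even if min_threshold is 1.
    needed = max(2, min_threshold)
    return [b[:max_threshold] for b in buckets if len(b) >= needed]
```

Under these assumptions, setting min_threshold to 1 would behave the same as setting it to 2: a bucket holding a single sstable is skipped, so its deleted rows are not purged by minor compaction alone.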

On Wed, Jan 5, 2011 at 2:26 PM, shimi <shim...@gmail.com> wrote:
> How is minor compaction triggered? Is it triggered only when a new
> SSTable is added?
>
> I was wondering if triggering a compaction with minimumCompactionThreshold
> set to 1 would be useful. If this can happen I assume it will do compaction
> on files with similar size and remove deleted rows on the rest.
> Shimi
> On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller <peter.schul...@infidyne.com>
> wrote:
>>
>> > I don't have a problem with disk space. I have a problem with the data
>> > size.
>>
>> [snip]
>>
>> > Bottom line is that I want to reduce the number of requests that go to
>> > disk. Since there is enough data that is no longer valid, I can do it by
>> > reclaiming the space. The only way to do it is by running a major
>> > compaction.
>> > I can wait and let Cassandra do it for me, but then the data size will
>> > get even bigger and the response time will be worse. I can do it
>> > manually, but I prefer it to happen in the background with less impact
>> > on the system.
>>
>> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>>
>> So essentially, for workloads that are teetering on the edge of cache
>> warmness and are subject to significant overwrites or removals, it may
>> be beneficial to perform much more aggressive background compaction
>> even though it might waste lots of CPU, to keep the in-memory working
>> set down.
>>
>> There was talk (I think in the compaction redesign ticket) about
>> potentially improving the use of bloom filters such that obsolete data
>> in sstables could be eliminated from the read set without
>> necessitating actual compaction; that might help address cases like
>> these too.
>>
>> I don't think there's a pre-existing silver bullet in a current
>> release; you probably have to live with the need for
>> greater-than-theoretically-optimal memory requirements to keep the
>> working set in memory.
>>
>> --
>> / Peter Schuller
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
