Re: Reclaim deleted rows space

Edward Capriolo Wed, 05 Jan 2011 17:46:50 -0800

On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Pretty sure there's logic in there that says "don't bother compacting
> a single sstable."
>
> On Wed, Jan 5, 2011 at 2:26 PM, shimi <shim...@gmail.com> wrote:
>> How is minor compaction triggered? Is it triggered only when a new
>> SSTable is added?
>>
>> I was wondering if triggering a compaction with minimumCompactionThreshold
>> set to 1 would be useful. If this can happen I assume it will do compaction
>> on files with similar size and remove deleted rows on the rest.
>> Shimi
>> On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller <peter.schul...@infidyne.com>
>> wrote:
>>>
>>> > I don't have a problem with disk space. I have a problem with the data
>>> > size.
>>>
>>> [snip]
>>>
>>> > Bottom line is that I want to reduce the number of requests that go to
>>> > disk. Since there is enough data that is no longer valid, I can do it by
>>> > reclaiming the space. The only way to do it is by running a major
>>> > compaction.
>>> > I can wait and let Cassandra do it for me, but then the data size will
>>> > get even bigger and the response time will be worse. I can do it
>>> > manually, but I prefer it to happen in the background with less impact
>>> > on the system.
>>>
>>> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>>>
>>> So essentially, for workloads that are teetering on the edge of cache
>>> warmness and are subject to significant overwrites or removals, it may
>>> be beneficial to perform much more aggressive background compaction
>>> even though it might waste lots of CPU, to keep the in-memory working
>>> set down.
>>>
>>> There was talk (I think in the compaction redesign ticket) about
>>> potentially improving the use of bloom filters such that obsolete data
>>> in sstables could be eliminated from the read set without
>>> necessitating actual compaction; that might help address cases like
>>> these too.
>>>
>>> I don't think there's a pre-existing silver bullet in a current
>>> release; you probably have to live with the need for
>>> greater-than-theoretically-optimal memory requirements to keep the
>>> working set in memory.
>>>
>>> --
>>> / Peter Schuller
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


I was wondering if it would make sense to have a JMX operation that can
compact a list of tables by file name. This opens it up for power
users to have more options than compacting an entire keyspace.
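
To make the idea concrete, here is a rough Python model of the two
selection paths discussed in this thread. It is only a sketch and not
Cassandra's actual code: the 0.5x to 1.5x "similar size" window, the
"never compact a lone sstable" guard (which mirrors Jonathan's remark),
the function names, and the file names are all assumptions made for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    name: str      # data file name (hypothetical examples below)
    size_mb: int   # on-disk size in megabytes


def size_buckets(sstables, low=0.5, high=1.5):
    """Group sstables whose size is within [low, high] x a bucket's average."""
    buckets = []
    for table in sorted(sstables, key=lambda s: s.size_mb):
        for bucket in buckets:
            avg = sum(s.size_mb for s in bucket) / len(bucket)
            if low * avg <= table.size_mb <= high * avg:
                bucket.append(table)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([table])
    return buckets


def minor_candidates(sstables, min_threshold=4):
    """Buckets eligible for automatic compaction; a lone sstable never is."""
    needed = max(2, min_threshold)  # assumed guard, even if threshold is 1
    return [b for b in size_buckets(sstables) if len(b) >= needed]


def pick_by_name(sstables, names):
    """The proposed alternative: the operator lists the files explicitly."""
    wanted = set(names)
    return [s for s in sstables if s.name in wanted]


tables = [
    SSTable("cf-1-Data.db", 10),
    SSTable("cf-2-Data.db", 12),
    SSTable("cf-3-Data.db", 11),
    SSTable("cf-4-Data.db", 9),
    SSTable("cf-5-Data.db", 900),  # large file holding many deleted rows
]

# Even with the threshold lowered to 1, the big file sits alone in its
# bucket and is skipped by the automatic path.
automatic = minor_candidates(tables, min_threshold=1)

# Naming the file explicitly sidesteps the bucketing heuristic entirely.
manual = pick_by_name(tables, ["cf-5-Data.db"])
```

In this model the large sstable is never picked automatically, which is
the case shimi was asking about; an operation that takes file names
would let an operator target it directly.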
