> https://issues.apache.org/jira/browse/CASSANDRA-4671
Thanks for the tip. 

A

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 8:47 PM, Michał Michalski <mich...@opera.com> wrote:

> Hi Aaron,
> 
> > * Tombstones will only be purged if all fragments of a row are in the 
> > SStable(s) being compacted.
> 
> As far as I know, that's not necessarily true. In one specific case this 
> patch comes into play:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-4671
> 
> "We could however purge tombstone if we know that the non-compacted sstables 
> doesn't have any info that is older than the tombstones we're about to purge 
> (since then we know that the tombstones we'll consider can't delete data in 
> non compacted sstables)."
> 
> M.
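
The two conditions discussed here (the "all fragments" rule and the CASSANDRA-4671 relaxation) can be sketched as a small decision function. This is only an illustration, not Cassandra's code: the function name, its arguments and the strict comparison are assumptions, and the gc_grace check is deliberately left out.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and arguments are hypothetical, not Cassandra internals.
# Usage: can_purge_tombstone <tombstone_ts> <all_fragments_compacted: yes|no> [<min_ts_outside>]
#   tombstone_ts    timestamp of the tombstone being considered
#   min_ts_outside  oldest timestamp found in sstables NOT part of this compaction
can_purge_tombstone() {
  local tombstone_ts=$1 all_fragments=$2 min_ts_outside=${3:-0}

  # Original rule: every fragment of the row is in the compaction.
  if [ "$all_fragments" = "yes" ]; then
    echo purge
    return 0
  fi

  # CASSANDRA-4671 idea: nothing outside the compaction is older than the
  # tombstone, so the tombstone cannot be shadowing any data there.
  if [ "$min_ts_outside" -gt "$tombstone_ts" ]; then
    echo purge
  else
    echo keep
  fi
}

can_purge_tombstone 100 yes      # all fragments are being compacted
can_purge_tombstone 100 no 150   # data outside the compaction is newer than the tombstone
can_purge_tombstone 100 no 50    # data outside the compaction is older than the tombstone
```

When the oldest outside timestamp is unknown, the sketch defaults it to 0, which makes the function answer "keep", the conservative choice.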
> 
> On 12.07.2013 10:25, aaron morton wrote:
>> That sounds sane to me. Couple of caveats:
>> 
>> * Remember that Expiring Columns turn into Tombstones and can only be purged 
>> after TTL and gc_grace.
>> * Tombstones will only be purged if all fragments of a row are in the 
>> SStable(s) being compacted.
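
To make the first caveat concrete, here is a rough calculation of the earliest time an expiring column could be purged, taking the caveat at face value (write time plus TTL plus gc_grace). The function name and the example values are made up for illustration; 864000 seconds is the default gc_grace_seconds, and the exact behaviour may differ between Cassandra versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: earliest epoch second at which an expiring column
# could be purged, assuming "write time + TTL + gc_grace" as described above.
earliest_purge_time() {
  # gc_grace falls back to 864000 s (10 days), the default gc_grace_seconds.
  local write_time=$1 ttl=$2 gc_grace=${3:-864000}
  echo $(( write_time + ttl + gc_grace ))
}

# Example: written at epoch 1373500000 with a 1-day TTL and a 1-hour gc_grace.
earliest_purge_time 1373500000 86400 3600
```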
>> 
>> Cheers
>> 
>> On 11/07/2013, at 10:17 PM, Theo Hultberg <t...@iconara.net> wrote:
>> 
>>> a colleague of mine came up with an alternative solution that also seems to 
>>> work, and I'd like your opinion on whether it's sound.
>>> 
>>> we run find to list all old sstables, and then use cmdline-jmxclient to run 
>>> the forceUserDefinedCompaction function on each of them. this is roughly 
>>> what we do (but with find and xargs to orchestrate it):
>>> 
>>>   java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction=the_keyspace,db_file_name
>>> 
>>> the downside is that c* needs to read the file and do disk io, but the 
>>> upside is that it doesn't require a restart. c* does a little more work, 
>>> but we can schedule that during off-peak hours. another upside is that it 
>>> feels like we're pretty safe from screwups: we won't accidentally remove an 
>>> sstable with live data. the worst case is that we ask c* to compact an 
>>> sstable with live data and end up with an identical sstable.
>>> 
>>> if anyone else wants to do the same thing, this is the full cron command:
>>> 
>>> 0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type f -name '*-Data.db' -mtime +8 -printf "forceUserDefinedCompaction=the_keyspace_name,\%P\n" | xargs -t --no-run-if-empty java -jar /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager
>>> 
>>> just change the keyspace name and the path to the data directory.
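
For anyone who would rather review the list before touching the cluster, the find half of the cron command could be wrapped in a small function like the one below. This is a sketch under assumptions: the function name and its parameters are made up, it needs GNU find for -printf, and it only prints the arguments, it does not call JMX. Outside of a crontab the % does not need the backslash.

```shell
#!/usr/bin/env bash
# Sketch only: prints one forceUserDefinedCompaction argument per old sstable.
# Function name and parameters are hypothetical; requires GNU find (-printf).
list_compaction_args() {
  local data_dir=$1 keyspace=$2 min_age_days=${3:-8}
  # Same filters as the cron line: top-level *-Data.db files older than N days.
  find "$data_dir" -maxdepth 1 -type f -name '*-Data.db' \
       -mtime "+$min_age_days" \
       -printf "forceUserDefinedCompaction=${keyspace},%P\n"
}

# Dry run: look at the list first, then pipe it to xargs as in the cron line.
# list_compaction_args /path/to/cassandra/data/the_keyspace_name the_keyspace_name
```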
>>> 
>>> T#
>>> 
>>> 
>>> On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg <t...@iconara.net> wrote:
>>> thanks a lot. I can confirm that it solved our problem too.
>>> 
>>> looks like the C* 2.0 feature is perfect for us.
>>> 
>>> T#
>>> 
>>> 
>>> On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson <krum...@gmail.com> wrote:
>>> yep that works, you need to remove all components of the sstable though, 
>>> not just -Data.db
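
As a rough illustration of "all components", the sketch below lists every file that shares a name prefix with a given -Data.db file (for example the -Index.db, -Filter.db and -Statistics.db files). The function name is hypothetical, and it assumes that all components of an sstable share the part of the file name that precedes "-Data.db". It only lists files, it does not delete anything.

```shell
#!/usr/bin/env bash
# Sketch only: list every component file that belongs to the same sstable as
# a given -Data.db file. The function name is hypothetical; it assumes all
# components share the file name prefix that precedes "-Data.db".
sstable_components() {
  local data_file=$1
  local prefix=${data_file%-Data.db}   # strip the -Data.db suffix
  local f
  for f in "$prefix"-*; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] && echo "$f"
  done
  return 0
}

# Example (path is a placeholder):
# sstable_components /path/to/cassandra/data/the_keyspace_name/db_file_name
```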
>>> 
>>> and, in 2.0 there is this:
>>> https://issues.apache.org/jira/browse/CASSANDRA-5228
>>> 
>>> /Marcus
>>> 
>>> 
>>> On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg <t...@iconara.net> wrote:
>>> Hi,
>>> 
>>> I think I remember reading that if you have sstables that you know contain 
>>> only data whose ttl has expired, it's safe to remove them manually by 
>>> stopping c*, removing the *-Data.db files and then starting up c* again. is 
>>> this correct?
>>> 
>>> we have a cluster where everything is written with a ttl, and sometimes c* 
>>> needs to compact over 100 gb of sstables where we know everything has 
>>> expired, and we'd rather just manually get rid of those.
>>> 
>>> T#
>>> 
>>> 
>>> 
>> 
>> 
> 
