> https://issues.apache.org/jira/browse/CASSANDRA-4671

Thanks for the tip.
A

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 8:47 PM, Michał Michalski <mich...@opera.com> wrote:

> Hi Aaron,
>
> > * Tombstones will only be purged if all fragments of a row are in the
> > SStable(s) being compacted.
>
> As far as I know, that's not necessarily true. In a specific case this
> patch comes into play:
>
> https://issues.apache.org/jira/browse/CASSANDRA-4671
>
> "We could however purge tombstone if we know that the non-compacted sstables
> doesn't have any info that is older than the tombstones we're about to purge
> (since then we know that the tombstones we'll consider can't delete data in
> non compacted sstables)."
>
> M.
>
> On 12.07.2013 10:25, aaron morton wrote:
>> That sounds sane to me. Couple of caveats:
>>
>> * Remember that Expiring Columns turn into Tombstones and can only be purged
>> after TTL and gc_grace.
>> * Tombstones will only be purged if all fragments of a row are in the
>> SStable(s) being compacted.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 11/07/2013, at 10:17 PM, Theo Hultberg <t...@iconara.net> wrote:
>>
>>> a colleague of mine came up with an alternative solution that also seems to
>>> work, and I'd just like your opinion on whether it's sound.
>>>
>>> we run find to list all old sstables, and then use cmdline-jmxclient to run
>>> the forceUserDefinedCompaction function on each of them. this is roughly
>>> what we do (but with find and xargs to orchestrate it):
>>>
>>> java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199
>>> org.apache.cassandra.db:type=CompactionManager
>>> forceUserDefinedCompaction=the_keyspace,db_file_name
>>>
>>> the downside is that c* needs to read the file and do disk io, but the
>>> upside is that it doesn't require a restart. c* does a little more work,
>>> but we can schedule that during off-peak hours. another upside is that it
>>> feels like we're pretty safe from screwups: we won't accidentally remove an
>>> sstable with live data. the worst case is that we ask c* to compact an
>>> sstable with live data and end up with an identical sstable.
>>>
>>> if anyone else wants to do the same thing, this is the full cron command:
>>>
>>> 0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type
>>> f -name '*-Data.db' -mtime +8 -printf
>>> "forceUserDefinedCompaction=the_keyspace_name,\%P\n" | xargs -t
>>> --no-run-if-empty java -jar
>>> /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199
>>> org.apache.cassandra.db:type=CompactionManager
>>>
>>> just change the keyspace name and the path to the data directory.
>>>
>>> T#
>>>
>>>
>>> On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg <t...@iconara.net> wrote:
>>> thanks a lot. I can confirm that it solved our problem too.
>>>
>>> looks like the C* 2.0 feature is perfect for us.
>>>
>>> T#
>>>
>>>
>>> On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson <krum...@gmail.com> wrote:
>>> yep that works, you need to remove all components of the sstable though,
>>> not just -Data.db
>>>
>>> and, in 2.0 there is this:
>>> https://issues.apache.org/jira/browse/CASSANDRA-5228
>>>
>>> /Marcus
>>>
>>>
>>> On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg <t...@iconara.net> wrote:
>>> Hi,
>>>
>>> I think I remember reading that if you have sstables that you know contain
>>> only data whose ttl has expired, it's safe to remove them manually by
>>> stopping c*, removing the *-Data.db files and then starting up c* again. is
>>> this correct?
>>>
>>> we have a cluster where everything is written with a ttl, and sometimes c*
>>> needs to compact over 100 gb of sstables where we know everything has
>>> expired, and we'd rather just manually get rid of those.
>>>
>>> T#
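
For anyone adapting the cron job quoted above, here is a minimal Python sketch of its selection step. It is not from the original thread. The function names old_data_files and compaction_commands are hypothetical; the jar path, JMX address and MBean name are copied from Theo's command and will likely differ on your machines. The age check assumes the GNU find behaviour where -mtime +N drops the fractional part of the age, so +8 matches files at least nine days old. The sketch only builds the argument lists and does not run anything against Cassandra.

```python
import time
from pathlib import Path

# Values copied from the quoted cron command; adjust for your installation.
JMX_JAR = "/usr/local/share/java/cmdline-jmxclient-0.10.3.jar"
JMX_ADDRESS = "localhost:7199"
COMPACTION_MBEAN = "org.apache.cassandra.db:type=CompactionManager"


def old_data_files(data_dir, older_than_days, now=None):
    """Names of *-Data.db files directly in data_dir that are old enough.

    Ages are rounded down to whole days before comparing, which is how
    `find -mtime +N` is assumed to behave.
    """
    now = time.time() if now is None else now
    names = []
    for path in Path(data_dir).glob("*-Data.db"):  # non-recursive, like -maxdepth 1
        if not path.is_file():
            continue
        age_days = int((now - path.stat().st_mtime) // 86400)
        if age_days > older_than_days:
            names.append(path.name)
    return sorted(names)


def compaction_commands(keyspace, file_names):
    """Build one jmxclient argument list per SSTable (hypothetical helper)."""
    return [
        ["java", "-jar", JMX_JAR, "-", JMX_ADDRESS, COMPACTION_MBEAN,
         f"forceUserDefinedCompaction={keyspace},{name}"]
        for name in file_names
    ]
```

Each list returned by compaction_commands("the_keyspace_name", old_data_files("/path/to/cassandra/data/the_keyspace_name", 8)) could be handed to subprocess.run in turn. Printing the lists first is a cheap way to check which SSTables would be touched.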