thanks aaron, I had not considered the second point, and it could explain
why the sstables don't always disappear completely: sometimes a small file
(megabytes instead of gigabytes) is left behind.
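
for anyone who wants to check which files the cron job quoted below would pick up before letting it call JMX, here is a minimal dry-run sketch. it assumes GNU find (for -maxdepth and -printf), and list_compaction_args is a made-up helper name, not part of cassandra or cmdline-jmxclient:

```shell
#!/bin/sh
# Dry run: print one forceUserDefinedCompaction argument per old sstable,
# without contacting JMX. list_compaction_args is a hypothetical helper;
# the data directory, keyspace name, and age threshold are placeholders.
list_compaction_args() {
  data_dir=$1   # e.g. /path/to/cassandra/data/the_keyspace_name
  keyspace=$2   # e.g. the_keyspace_name
  days=$3       # e.g. 8

  # -mtime +N matches files last modified more than N whole days ago.
  # %P prints the path relative to data_dir. In a crontab, % must be
  # escaped as \%, which is why the cron line writes \%P instead.
  find "$data_dir" -maxdepth 1 -type f -name '*-Data.db' \
       -mtime +"$days" -printf '%P\n' |
    sort |
    while IFS= read -r file; do
      printf 'forceUserDefinedCompaction=%s,%s\n' "$keyspace" "$file"
    done
}
```

calling list_compaction_args /path/to/cassandra/data/the_keyspace_name the_keyspace_name 8 prints the arguments that xargs would hand to cmdline-jmxclient, one per line, so the list can be reviewed first.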

T#


On Fri, Jul 12, 2013 at 10:25 AM, aaron morton <aa...@thelastpickle.com> wrote:

> That sounds sane to me. Couple of caveats:
>
> * Remember that Expiring Columns turn into Tombstones and can only be
> purged after TTL and gc_grace.
> * Tombstones will only be purged if all fragments of a row are in the
> SStable(s) being compacted.
>
> Cheers
>
> -----------------
> Aaron Morton
> Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 11/07/2013, at 10:17 PM, Theo Hultberg <t...@iconara.net> wrote:
>
> a colleague of mine came up with an alternative solution that also seems
> to work, and I'd like your opinion on whether it's sound.
>
> we run find to list all old sstables, and then use cmdline-jmxclient to
> run the forceUserDefinedCompaction operation on each of them. this is
> roughly what we do (but with find and xargs to orchestrate it):
>
>   java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction=the_keyspace,db_file_name
>
> the downside is that c* needs to read the file and do disk io, but the
> upside is that it doesn't require a restart. c* does a little more work,
> but we can schedule that during off-peak hours. another upside is that it
> feels like we're pretty safe from screwups, we won't accidentally remove an
> sstable with live data, the worst case is that we ask c* to compact an
> sstable with live data and end up with an identical sstable.
>
> if anyone else wants to do the same thing, this is the full cron command:
>
> 0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type f -name '*-Data.db' -mtime +8 -printf "forceUserDefinedCompaction=the_keyspace_name,\%P\n" | xargs -t --no-run-if-empty java -jar /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager
>
> just change the keyspace name and the path to the data directory.
>
> T#
>
>
> On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg <t...@iconara.net> wrote:
>
>> thanks a lot. I can confirm that it solved our problem too.
>>
>> looks like the C* 2.0 feature is perfect for us.
>>
>> T#
>>
>>
>> On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson <krum...@gmail.com> wrote:
>>
>>> yep that works, you need to remove all components of the sstable though,
>>> not just -Data.db
>>>
>>> and, in 2.0 there is this:
>>> https://issues.apache.org/jira/browse/CASSANDRA-5228
>>>
>>> /Marcus
>>>
>>>
>>> On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg <t...@iconara.net> wrote:
>>>
>>>> Hi,
>>>>
>>>> I think I remember reading that if you have sstables that you know
>>>> contain only data whose ttl has expired, it's safe to remove them
>>>> manually by stopping c*, removing the *-Data.db files and then starting
>>>> c* up again. is this correct?
>>>>
>>>> we have a cluster where everything is written with a ttl, and sometimes
>>>> c* needs to compact over 100 GB of sstables where we know everything has
>>>> expired, and we'd rather just get rid of those manually.
>>>>
>>>> T#
>>>>
>>>
>>>
>>
>
>
