To clarify, you do not need a TTL for deleted data to be compacted away in
Cassandra. When you delete, we create a tombstone, which remains in the
system for __at least__ gc_grace_seconds. We wait this long to give the
tombstone a chance to reach all replica nodes. The best practice is to run
repairs at least once every gc_grace_seconds to avoid edge cases where
data comes back to life (e.g. the tombstone was never sent to one of your
replicas, and once the tombstone and data are removed from the other two
replicas, all that is left is the old value).
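
As a rough sketch of the timing rule described above (not Cassandra's
actual code; the function and parameter names are made up for
illustration), a tombstone only becomes a candidate for removal once
gc_grace_seconds have passed since the delete:

```python
def tombstone_past_gc_grace(deletion_time: int, gc_grace_seconds: int, now: int) -> bool:
    """Return True once a tombstone has outlived gc_grace_seconds.

    All arguments are in seconds since the epoch. This models only the
    "at least gc grace seconds" condition; a real compaction also has to
    satisfy further conditions before it drops the tombstone.
    """
    return deletion_time + gc_grace_seconds < now


# Hypothetical values: a delete issued at t=1000 with a 10-day grace period.
ten_days = 10 * 24 * 60 * 60
print(tombstone_past_gc_grace(1000, ten_days, now=1000 + ten_days))      # still inside the grace window
print(tombstone_past_gc_grace(1000, ten_days, now=1000 + ten_days + 1))  # grace window has passed
print(tombstone_past_gc_grace(1000, 0, now=1001))                        # gc_grace_seconds = 0
```

With gc_grace_seconds = 0 the check passes almost immediately, which is
why a replica that missed the delete can later resurrect the old value.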

__At least__ are the key words in the previous paragraph: more conditions
need to be met before a tombstone actually gets cleaned up. Like most
things in Cassandra, these conditions are configurable (via the
compaction sub-properties documented here):

http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configure_compaction_t.html
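
To give a feel for how those sub-properties interact, here is a
simplified sketch (my own illustration, not the real implementation) of
the check that decides whether a single sstable is worth compacting just
to drop tombstones. The parameter names mirror the tombstone_threshold
and tombstone_compaction_interval sub-properties; the default values
used (0.2 and 86400 seconds) are the ones I recall for 2.1, so verify
them against the page linked above. The function name and inputs are
hypothetical.

```python
def worth_tombstone_compaction(
    droppable_tombstone_ratio: float,
    sstable_age_seconds: int,
    tombstone_threshold: float = 0.2,            # assumed default, check the docs
    tombstone_compaction_interval: int = 86400,  # assumed default (1 day), check the docs
) -> bool:
    """Simplified model of the single-sstable tombstone compaction check.

    droppable_tombstone_ratio is the estimated fraction of the sstable made
    up of tombstones that are already past gc_grace_seconds. The real check
    also considers overlap with other sstables, which is ignored here.
    """
    if sstable_age_seconds <= tombstone_compaction_interval:
        return False  # sstable is too young to be considered
    return droppable_tombstone_ratio > tombstone_threshold


# Hypothetical sstables:
print(worth_tombstone_compaction(0.5, sstable_age_seconds=2 * 86400))  # old and tombstone-heavy
print(worth_tombstone_compaction(0.5, sstable_age_seconds=3600))       # too young
print(worth_tombstone_compaction(0.1, sstable_age_seconds=2 * 86400))  # too few droppable tombstones
```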

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Aug 20, 2015 at 4:13 PM, Daniel Chia <danc...@coursera.org> wrote:

> The TTL shouldn't matter if you deleted the data, since to my
> understanding the delete should shadow the data signaling to C* that the
> data is a candidate for removal on compaction.
>
> Others might know better, but it could very well be the fact that
> gc_grace_seconds is 0 that is causing your problems. Others might have
> other suggestions, but you could potentially use sstable2json to see the
> raw contents of the sstable on disk and see why data is still there.
>
> Thanks,
> Daniel
>
> On Thu, Aug 20, 2015 at 12:55 PM, Analia Lorenzatto <
> analialorenza...@gmail.com> wrote:
>
>> Hello,
>>
>> Daniel, I am using Size Tiered compaction.
>>
>> My concern is that as I do not have a TTL defined on the Column family,
>> and I do not have the possibility to create it.   Perhaps, the "deleted
>> data" is never actually going to be removed?
>>
>> Thanks a lot!
>>
>>
>> On Thu, Aug 20, 2015 at 4:24 AM, Daniel Chia <danc...@coursera.org>
>> wrote:
>>
>>> Is this a LCS family, or Size Tiered? Manually running compaction on LCS
>>> doesn't do anything until C* 2.2 (
>>> https://issues.apache.org/jira/browse/CASSANDRA-7272)
>>>
>>> Thanks,
>>> Daniel
>>>
>>> On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto <
>>> analialorenza...@gmail.com> wrote:
>>>
>>>> Hello Michael,
>>>>
>>>> Thanks for responding!
>>>>
>>>> I do not have snapshots on any node of the cluster.
>>>>
>>>> Saludos / Regards.
>>>>
>>>> Analía Lorenzatto.
>>>>
>>>> "Hapiness is not something really made. It comes from your own actions"
>>>> by Dalai Lama
>>>>
>>>>
>>>> On 19 Aug 2015 6:19 pm, "Laing, Michael" <michael.la...@nytimes.com>
>>>> wrote:
>>>>
>>>>> Possibly you have snapshots? If so, use nodetool to clear them.
>>>>>
>>>>> On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto <
>>>>> analialorenza...@gmail.com> wrote:
>>>>>
>>>>>> Hello guys,
>>>>>>
>>>>>> I have a cassandra cluster 2.1 comprised of 4 nodes.
>>>>>>
>>>>>> I removed a lot of data in a Column Family, then I ran manually a
>>>>>> compaction on this Column family on every node.   After doing that, If I
>>>>>> query that data, cassandra correctly says this data is not there.  But 
>>>>>> the
>>>>>> space on disk is exactly the same before removing that data.
>>>>>>
>>>>>> Also, I realized that  gc_grace_seconds = 0.  Some people on the
>>>>>> internet say that it could produce zombie data, what do you think?
>>>>>>
>>>>>> I do not have a TTL defined on the Column family, and I do not have
>>>>>> the possibility to create it.   So my questions is, given that I do not
>>>>>> have a TTL defined is data going to be removed?  or the deleted data is
>>>>>> never actually going to be deleted due to I do not have a TTL?
>>>>>>
>>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>> --
>>>>>> Saludos / Regards.
>>>>>>
>>>>>> Analía Lorenzatto.
>>>>>>
>>>>>> “It's possible to commit no errors and still lose. That is not
>>>>>> weakness.  That is life".  By Captain Jean-Luc Picard.
>>>>>>
>>>>>
>>>>>
>>>
>>
>>
>> --
>> Saludos / Regards.
>>
>> Analía Lorenzatto.
>>
>> “It's possible to commit no errors and still lose. That is not weakness.
>> That is life".  By Captain Jean-Luc Picard.
>>
>
>
