Re: secondary index table - tombstones surviving compactions

Jeff Jirsa Fri, 18 May 2018 07:13:51 -0700

This would matter for the base table, but would be less likely for the 
secondary index, where the partition key is the indexed value from the
base row.
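
To make that concrete, here is a rough Python sketch of the purge rule as Eric
describes it in the quoted message below. It is a toy model under stated
assumptions, not Cassandra's actual code: the SSTable class, can_purge() and
all of the key names are made up for illustration, and a plain set stands in
for the bloom filter.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Toy SSTable (hypothetical): only tracks which partition keys it holds."""
    name: str
    partition_keys: set = field(default_factory=set)

    def might_contain(self, key):
        # Stand-in for a bloom filter lookup. A real bloom filter can also
        # return false positives; this toy version never does.
        return key in self.partition_keys


def can_purge(key, tombstone_age, gc_grace, compacting, all_sstables):
    """Return True if a tombstone for `key` may be dropped by this compaction.

    Simplified rule from the thread: the tombstone must be older than
    gc_grace, and no SSTable outside the compaction may report that it
    contains the same partition key.
    """
    if tombstone_age <= gc_grace:
        return False
    others = [s for s in all_sstables if s not in compacting]
    return not any(s.might_contain(key) for s in others)


# Base table: partition key "pk-1" is recycled from a pool, so it shows up
# again in a newer SSTable that is not part of the compaction.
base1 = SSTable("base-1", {"pk-1", "pk-2"})   # original row
base2 = SSTable("base-2", {"pk-1"})           # tombstone for pk-1
base3 = SSTable("base-3", {"pk-1"})           # pk-1 reused for a new row

# Index table: the partition key is the indexed value, which is mostly
# unique, so the new row lands in a different index partition.
idx1 = SSTable("idx-1", {"val-a", "val-b"})   # original index entry
idx2 = SSTable("idx-2", {"val-a"})            # tombstone for val-a
idx3 = SSTable("idx-3", {"val-z"})            # new row has a new value

base_ok = can_purge("pk-1", 700, 600, [base1, base2], [base1, base2, base3])
index_ok = can_purge("val-a", 700, 600, [idx1, idx2], [idx1, idx2, idx3])
print(base_ok)   # False: base-3 still reports pk-1
print(index_ok)  # True: no other SSTable reports val-a
```

In this model the recycled base-table key blocks the purge, while the index
tombstone is free to go, which is why I would not expect key reuse alone to
explain the index table growing.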


Roman: there’s a config option related to only purging repaired tombstones
(the only_purge_repaired_tombstones compaction option) - do you have that
enabled? If so, are you running repairs?

-- 
Jeff Jirsa


> On May 18, 2018, at 6:41 AM, Eric Stevens <migh...@gmail.com> wrote:
> 
> The answer to Question 3 is "yes." One of the more subtle points about
> tombstones is that Cassandra won't remove them during compaction if the
> bloom filter of any other SSTable on that replica indicates that it may
> contain the same partition (not primary) key, even if the tombstone is
> older than gc_grace and would otherwise be a candidate for cleanup.
> 
> If you're recycling partition keys, your tombstones may never be able to be
> cleaned up, because in this scenario there is a high probability that an
> SSTable not involved in that compaction also contains the same partition
> key, and so compaction cannot have confidence that it's safe to remove the
> tombstone (it would have to fully materialize every record in the
> compaction, which is too expensive).
> 
> In general it is an antipattern in Cassandra to write to a given partition
> indefinitely for this and other reasons.
> 
> On Fri, May 18, 2018 at 2:37 AM Roman Bielik <
> roman.bie...@openmindnetworks.com> wrote:
> 
>> Hi,
>> 
>> I have a Cassandra 3.11 table (with compact storage) and am using secondary
>> indices with rather unique data stored in the indexed columns. There are
>> many inserts and deletes, so in order to avoid tombstones piling up I'm
>> re-using primary keys from a pool (which works fine).
>> I'm aware that this design pattern is not ideal, but for now I cannot
>> change it easily.
>> 
>> The problem is, the size of 2nd index tables keeps growing (filled with
>> tombstones) no matter what.
>> 
>> I tried some aggressive configuration (just for testing) in order to
>> expedite the tombstone removal but with little-to-zero effect:
>> COMPACTION = { 'class': 'LeveledCompactionStrategy',
>>                'unchecked_tombstone_compaction': 'true',
>>                'tombstone_compaction_interval': 600 }
>> gc_grace_seconds = 600
>> 
>> I'm aware that perhaps materialized views could provide a solution to this,
>> but I'm bound to the Thrift interface, so I cannot use them.
>> 
>> Questions:
>> 1. Is there something I'm missing? How come compaction does not remove the
>> obsolete indices/tombstones from 2nd index tables? Can I trigger the
>> cleanup manually somehow?
>> I have tried nodetool flush, compact, rebuild_index on both data table and
>> internal Index table, but with no result.
>> 
>> 2. When deleting a record I'm deleting the whole row at once, which would
>> create one tombstone for the whole record if I'm correct. Would it help to
>> delete the indexed columns separately, creating an extra tombstone for each
>> cell?
>> As I understand the underlying mechanism, the indexed column value must be
>> read in order for a proper tombstone to be created for the index entry.
>> 
>> 3. Could the fact that I'm reusing the primary key of a deleted record
>> shortly afterwards for a new insert interfere with the secondary index
>> tombstone removal?
>> 
>> Will be grateful for any advice.
>> 
>> Regards,
>> Roman
