Hi Jordan, thank you for accepting this as an issue. I will follow the ticket.
Best regards,
Roman

On 30 May 2018 at 11:40, Jordan West <jorda...@gmail.com> wrote:

> Hi Roman,
>
> I was able to reproduce the issue you described. I filed
> https://issues.apache.org/jira/browse/CASSANDRA-14479. More details there.
>
> Thanks for reporting!
> Jordan
>
>
> On Wed, May 23, 2018 at 12:06 AM, Roman Bielik <
> roman.bie...@openmindnetworks.com> wrote:
>
> > Hi,
> >
> > I apologise for the late response; I wanted to run some further tests so
> > that I can provide more information to you.
> >
> > @Jeff, no, I don't set the "only_purge_repaired_tombstones" option. It
> > should be the default: False.
> > And no, I don't run repairs during the tests.
> >
> > @Eric, I understand that rapid deletes/inserts are some kind of
> > antipattern; nevertheless, I'm not experiencing any problems with that
> > (except for the 2nd indices).
> >
> > Update: I ran a new test where I delete the indexed columns separately,
> > plus delete the whole row at the end.
> > Surprisingly, this test scenario works fine. Using nodetool flush +
> > compact (in order to expedite the test) seems to always purge the index
> > table.
> > So that's great because I seem to have found a workaround; on the other
> > hand, could there be a bug in Cassandra - a leaking index table?
> >
> > Test details:
> > Create table with LeveledCompactionStrategy;
> > 'tombstone_compaction_interval': 60; gc_grace_seconds=60
> > There are two indexed columns for comparison: column1, column2
> > Insert keys {1..x} with random values in column1 & column2
> > Delete {key:column2} (but not column1)
> > Delete {key}
> > Repeat n-times from the inserts
> > Wait 1 minute
> > nodetool flush
> > nodetool compact (sometimes compact <keyspace> <table.index>)
> > nodetool cfstats
> >
> > What I observe is that the data table is empty, the column2 index table
> > is also empty, and the column1 index table has non-zero (leaked) "space
> > used" and "estimated rows".
> >
> > Roman
> >
> >
> > On 18 May 2018 at 16:13, Jeff Jirsa <jji...@gmail.com> wrote:
> >
> > > This would matter for the base table, but would be less likely for the
> > > secondary index, where the partition key is the value of the base row
> > >
> > > Roman: there’s a config option related to only purging repaired
> > > tombstones - do you have that enabled? If so, are you running repairs?
> > >
> > > --
> > > Jeff Jirsa
> > >
> > >
> > > > On May 18, 2018, at 6:41 AM, Eric Stevens <migh...@gmail.com> wrote:
> > > >
> > > > The answer to Question 3 is "yes." One of the more subtle points about
> > > > tombstones is that Cassandra won't remove them during compaction if
> > > > there is a bloom filter on any SSTable on that replica indicating that
> > > > it contains the same partition (not primary) key. Even if it is older
> > > > than gc_grace, and would otherwise be a candidate for cleanup.
> > > >
> > > > If you're recycling partition keys, your tombstones may never be able
> > > > to be cleaned up, because in this scenario there is a high probability
> > > > that an SSTable not involved in that compaction also contains the same
> > > > partition key, and so compaction cannot have confidence that it's safe
> > > > to remove the tombstone (it would have to fully materialize every
> > > > record in the compaction, which is too expensive).
> > > >
> > > > In general it is an antipattern in Cassandra to write to a given
> > > > partition indefinitely for this and other reasons.
> > > >
> > > > On Fri, May 18, 2018 at 2:37 AM Roman Bielik <
> > > > roman.bie...@openmindnetworks.com> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I have a Cassandra 3.11 table (with compact storage) and using
> > > >> secondary indices with rather unique data stored in the indexed columns.
> > > >> There are many inserts and deletes, so in order to avoid tombstones
> > > >> piling up I'm re-using primary keys from a pool (which works fine).
> > > >> I'm aware that this design pattern is not ideal, but for now I cannot
> > > >> change it easily.
> > > >>
> > > >> The problem is that the size of the 2nd index tables keeps growing
> > > >> (filled with tombstones) no matter what.
> > > >>
> > > >> I tried some aggressive configuration (just for testing) in order to
> > > >> expedite the tombstone removal, but with little-to-zero effect:
> > > >> COMPACTION = { 'class':
> > > >> 'LeveledCompactionStrategy', 'unchecked_tombstone_compaction': 'true',
> > > >> 'tombstone_compaction_interval': 600 }
> > > >> gc_grace_seconds = 600
> > > >>
> > > >> I'm aware that perhaps Materialized views could provide a solution to
> > > >> this, but I'm bound to the Thrift interface, so I cannot use them.
> > > >>
> > > >> Questions:
> > > >> 1. Is there something I'm missing? How come compaction does not remove
> > > >> the obsolete indices/tombstones from 2nd index tables? Can I trigger
> > > >> the cleanup manually somehow?
> > > >> I have tried nodetool flush, compact, rebuild_index on both data table
> > > >> and internal Index table, but with no result.
> > > >>
> > > >> 2. When deleting a record I'm deleting the whole row at once - which
> > > >> would create one tombstone for the whole record if I'm correct. Would
> > > >> it help to delete the indexed columns separately, creating an extra
> > > >> tombstone for each cell?
> > > >> As I understand the underlying mechanism, the indexed column value
> > > >> must be read in order for a proper tombstone to be created for the
> > > >> index.
> > > >>
> > > >> 3. Could the fact that I'm reusing the primary key of a deleted record
> > > >> shortly afterwards for a new insert interact with the secondary index
> > > >> tombstone removal?
> > > >>
> > > >> Will be grateful for any advice.
> > > >>
> > > >> Regards,
> > > >> Roman
> > > >>
> > > >> --
> > > >> <http://www.openmindnetworks.com>
> > > >> <http://www.openmindnetworks.com/>
> > > >> <https://www.linkedin.com/company/openmind-networks>
> > > >> <https://twitter.com/Openmind_Ntwks> <http://www.openmindnetworks.com/>
> > > >>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
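
The "Test details" steps in the 23 May message can be sketched as a small Python script that only generates the CQL statements and lists the nodetool steps. This is a rough sketch, not the script that was actually run. The keyspace and table names are hypothetical, the keyspace is assumed to exist already, the table is plain CQL (no compact storage or Thrift), and the batch ordering (all inserts, then all column2 deletes, then all row deletes) is one possible reading of the steps.

```python
import random

# Hypothetical names; the thread does not give the real keyspace or table.
KEYSPACE = "test_ks"
TABLE = "test_table"


def schema_statements():
    """CQL for a table with two indexed columns, per the test details."""
    return [
        f"CREATE TABLE {KEYSPACE}.{TABLE} ("
        "key int PRIMARY KEY, column1 text, column2 text) "
        "WITH compaction = {'class': 'LeveledCompactionStrategy', "
        "'tombstone_compaction_interval': '60'} "
        "AND gc_grace_seconds = 60;",
        f"CREATE INDEX ON {KEYSPACE}.{TABLE} (column1);",
        f"CREATE INDEX ON {KEYSPACE}.{TABLE} (column2);",
    ]


def round_statements(num_keys, rng):
    """One round: insert keys 1..x, delete column2 only, then delete the row."""
    stmts = []
    for key in range(1, num_keys + 1):
        v1, v2 = rng.randrange(10**9), rng.randrange(10**9)
        stmts.append(
            f"INSERT INTO {KEYSPACE}.{TABLE} (key, column1, column2) "
            f"VALUES ({key}, 'a{v1}', 'b{v2}');"
        )
    for key in range(1, num_keys + 1):
        # Cell-level delete of column2; column1 is left to the row delete.
        stmts.append(f"DELETE column2 FROM {KEYSPACE}.{TABLE} WHERE key = {key};")
    for key in range(1, num_keys + 1):
        stmts.append(f"DELETE FROM {KEYSPACE}.{TABLE} WHERE key = {key};")
    return stmts


def nodetool_commands():
    """Shell steps to run after the CQL, as listed in the message."""
    return ["sleep 60", "nodetool flush", "nodetool compact", "nodetool cfstats"]


def build_script(num_keys=3, rounds=2, seed=0):
    """Return (cql_statements, shell_commands) for the whole test."""
    rng = random.Random(seed)
    cql = schema_statements()
    for _ in range(rounds):
        cql.extend(round_statements(num_keys, rng))
    return cql, nodetool_commands()


if __name__ == "__main__":
    cql, shell = build_script()
    print("\n".join(cql))
    print("\n".join("# then run: " + c for c in shell))
```

The printed CQL could be piped into cqlsh and the nodetool steps run by hand afterwards. Comparing the column1 and column2 index tables in the cfstats output is the check described in the message.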
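
The purge rule Eric describes in his 18 May reply can be illustrated with a toy model. This is a simplified sketch of the rule as stated in that message, not Cassandra's actual code. The SSTable class and the can_purge_tombstone function are hypothetical, a plain set stands in for the bloom filter, and any further conditions the real implementation applies are not modelled. Per Jeff's reply, this rule matters mainly for the base table and is less likely to apply to the secondary index.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Toy stand-in for an SSTable: a name plus the partition keys it holds.

    A real bloom filter can report false positives; a set cannot, so this
    model is, if anything, more optimistic than a bloom-filter check.
    """
    name: str
    partition_keys: set = field(default_factory=set)

    def might_contain(self, key):
        return key in self.partition_keys


def can_purge_tombstone(key, deletion_time, now, gc_grace_seconds,
                        compacting, all_sstables):
    """Return True if the tombstone for `key` may be dropped in this compaction.

    Two conditions from the thread: the tombstone must be older than
    gc_grace, and no SSTable outside the compaction may contain the key.
    """
    if now - deletion_time < gc_grace_seconds:
        return False  # still inside gc_grace
    compacting_names = {s.name for s in compacting}
    others = [s for s in all_sstables if s.name not in compacting_names]
    return not any(s.might_contain(key) for s in others)


if __name__ == "__main__":
    a = SSTable("a", {"k1"})
    b = SSTable("b", {"k1", "k2"})  # recycled key k1 also lives here
    c = SSTable("c", {"k3"})
    # Compacting only "a": "b" may still hold k1, so the tombstone stays.
    print(can_purge_tombstone("k1", 0, 1000, 600, [a], [a, b, c]))
    # Compacting "a" and "b" together: no other SSTable holds k1.
    print(can_purge_tombstone("k1", 0, 1000, 600, [a, b], [a, b, c]))
```

In this model, recycling partition keys keeps the overlap condition true for most compactions, which is why the tombstones may never be cleaned up, as the reply points out.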