I read through the ticket a few times. We have replication factor 3 and LocalQuorum.
Do we still think CASSANDRA-15690 is a possibility with RF = 3?

________________________________
From: shankha b <shankha-ms-wor...@outlook.com>
Sent: Saturday, February 4, 2023 9:29 PM
To: user@cassandra.apache.org <user@cassandra.apache.org>
Subject: Re: Deletions getting omitted

I will look into raising gc_grace_seconds. We are using LocalQuorum for all reads and writes. We do not use ALL, precisely for outage reasons.

________________________________
From: Jeff Jirsa <jji...@gmail.com>
Sent: Saturday, February 4, 2023 8:44 PM
To: user@cassandra.apache.org <user@cassandra.apache.org>
Subject: Re: Deletions getting omitted

While you'd expect only_purge_repaired_tombstones:true to be sufficient, your gc_grace_seconds of 1 hour is making you unusually susceptible to resurrecting data. (To be clear, you should be safe to do this, but if there is a bug hiding in there somewhere, your low gc_grace_seconds will make it likely to resurrect data; if this is causing you problems, I'd try raising that first to mitigate while you investigate the real cause.)

If it's CASSANDRA-15690, a second read at consistency ALL may cause the data to properly show up as "deleted" (you don't want to use ALL all the time, because it'll be an outage if you ever have a node go down).

Given CASSANDRA-15690 exists, you probably want to upgrade.

On Sat, Feb 4, 2023 at 4:56 PM shankha b <shankha-ms-wor...@outlook.com> wrote:

We are facing an issue on one of our production systems where, after we delete the data, the data doesn't seem to get deleted. We have a Get call just after the delete call. The data shows up.

Versions
cassandra : 3.11.6
gocqlx : v2 v2.1.0

1. Client Settings: LocalQuorum
2. Number of Nodes : 3
3. All 3 nodes up and running for weeks.
4. Inserts were done a few days earlier, so there is a good amount of time between the Inserts and the Deletes, and the Inserts went through successfully.
The Delete Call :

    q := s.session.Query(stmt, names).BindStruct(*customModel)
    err := q.ExecRelease()

We do check the error and it is nil. There are no exceptions during that time on either the client side or the server side.

The Get Call :

    q := s.session.Query(stmt, names).BindStruct(*customModel)
    err := q.GetRelease(customModel)

This returns the data successfully.

We do have these two options enabled.

1. https://docs.datastax.com/en/dse/6.8/dse-dev/datastax_enterprise/config/configCassandra_yaml.html#configCassandra_yaml__commitlog_sync
   batch - Send ACK signal for writes after the commit log has been flushed to disk. Each incoming write triggers the flush task.
2. only_purge_repaired_tombstones

This does not happen for all the delete operations. For many of them, the delete seems to go through. This does not seem to be timing-related, and the successful and unsuccessful ones are spread out.

CASSANDRA-15690 Single partition queries can mistakenly omit partition deletions and resurrect data

I am trying to go through this PR and ticket. If you have any suggestions, please do let me know.
The table structure is the following:

    CREATE KEYSPACE cycling
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}
        AND durable_writes = true;

    CREATE TABLE cycling.rider (
        uuid text,
        created_at timestamp,
        PRIMARY KEY (uuid, created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
        AND comment = ''
        AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4', 'only_purge_repaired_tombstones': 'true'}
        AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.1
        AND default_time_to_live = 0
        AND gc_grace_seconds = 3600
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE';

Thanks
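
For reference, the arithmetic behind the two settings discussed in this thread (RF = 3 with LocalQuorum, and gc_grace_seconds = 3600 with only_purge_repaired_tombstones) can be sketched in Go as below. This is a simplified model and not Cassandra's implementation: the function names quorum, overlaps, and purgeable are hypothetical, and the purge rule assumes a tombstone becomes droppable only once it is older than gc_grace_seconds and, when only_purge_repaired_tombstones is set, only if its SSTable has been repaired.

```go
package main

import "fmt"

// quorum returns how many replicas must respond to a QUORUM or
// LOCAL_QUORUM request for the given replication factor.
func quorum(rf int) int {
	return rf/2 + 1
}

// overlaps reports whether any set of read replicas must share at least
// one replica with any set of write replicas (R + W > RF).
func overlaps(rf, readReplicas, writeReplicas int) bool {
	return readReplicas+writeReplicas > rf
}

// purgeable is a simplified, assumed model of when compaction may drop a
// tombstone. Times are in seconds. It is NOT Cassandra's actual code.
func purgeable(deletedAt, now, gcGraceSeconds int64, sstableRepaired, onlyPurgeRepaired bool) bool {
	// With only_purge_repaired_tombstones, unrepaired tombstones are kept.
	if onlyPurgeRepaired && !sstableRepaired {
		return false
	}
	// Otherwise the tombstone must be older than gc_grace_seconds.
	return now-deletedAt > gcGraceSeconds
}

func main() {
	rf := 3
	q := quorum(rf)
	fmt.Printf("RF=%d quorum=%d read/write overlap=%v\n", rf, q, overlaps(rf, q, q))

	const gcGrace = 3600 // gc_grace_seconds from the table definition
	deletedAt := int64(0)
	twoHoursLater := int64(7200)
	fmt.Println("unrepaired, 2h old:", purgeable(deletedAt, twoHoursLater, gcGrace, false, true))
	fmt.Println("repaired, 2h old:  ", purgeable(deletedAt, twoHoursLater, gcGrace, true, true))
}
```

With RF = 3, a quorum is 2 replicas, so a LocalQuorum read and a LocalQuorum delete always share at least one replica. In principle, the consistency levels alone would therefore not explain a read that misses an acknowledged delete. The purge check shows why a one-hour gc_grace_seconds leaves little margin once a tombstone's SSTable counts as repaired.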