Hey guys,

Has anyone seen behavior like this, or does anyone have an explanation for it? If not, I think I'm gonna file a bug report.
Thanks!
Roman

On Mon, Mar 23, 2015 at 4:45 PM, Roman Tkachenko <ro...@mailgunhq.com> wrote:

> Hey guys,
>
> We're having a very strange issue: deleted columns get resurrected when
> "repair" is run on a node.
>
> Info about the setup. Cassandra 2.0.13, multi datacenter with 12 nodes in
> one datacenter and 6 nodes in another one. Schema:
>
> cqlsh> describe keyspace blackbook;
>
> CREATE KEYSPACE blackbook WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'IAD': '3',
>   'ORD': '3'
> };
>
> USE blackbook;
>
> CREATE TABLE bounces (
>   domainid text,
>   address text,
>   message text,
>   "timestamp" bigint,
>   PRIMARY KEY (domainid, address)
> ) WITH
>   bloom_filter_fp_chance=0.100000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.100000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.000000 AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'LeveledCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
>
> We're using wide rows for the "bounces" table that can store hundreds of
> thousands of addresses for each "domainid" (in practice it's much less
> usually, but some rows may contain up to several million columns).
>
> All queries are done using LOCAL_QUORUM consistency. Sometimes bounces are
> deleted from the table using the following CQL3 statement:
>
> delete from bounces where domainid = 'domain.com' and address = 'al...@example.com';
>
> But the thing is, after "repair" is run on any node that owns "domain.com"
> key, the column gets resurrected on all nodes as if the tombstone has
> disappeared. We checked this multiple times using cqlsh: issue a delete
> statement and verify that data is not returned; then run "repair" and the
> deleted data is returned again.
>
> Our gc_grace_seconds is of the default value and no nodes ever were down
> for anywhere close to 10 days, so it doesn't look like it's related. We
> also made sure all our servers are running ntpd so time synchronization
> should not be an issue as well.
>
> Have you guys ever seen anything like this / have any idea as to what may
> be causing this behavior? What could make "tombstone" disappear during
> "repair" operation?
>
> Thanks for your help. Let me know if I can provide more information.
>
> Roman
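
To make the expectation in the quoted message concrete, here is a minimal Python sketch of timestamp-based reconciliation for a single column. It is a toy model, not Cassandra code: `Cell`, `reconcile`, `merge_replicas`, and `visible_value` are hypothetical names, and the bounce message and timestamps are made up. It only shows that a tombstone newer than the data should keep the column hidden after replicas are merged, and that the reported symptom matches what a merge produces when no replica contributes a tombstone. It does not identify why that would happen on this cluster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One version of a column. value=None marks a tombstone (hypothetical model)."""
    value: Optional[str]
    timestamp: int  # write timestamp, e.g. microseconds since epoch

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def reconcile(a: Cell, b: Cell) -> Cell:
    """Last write wins; on a timestamp tie the tombstone wins."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    if a.is_tombstone != b.is_tombstone:
        return a if a.is_tombstone else b
    # Same timestamp and same kind: pick deterministically by value.
    return a if (a.value or "") >= (b.value or "") else b


def merge_replicas(replicas: List[Optional[Cell]]) -> Optional[Cell]:
    """Merge each replica's view of the column; None means the replica holds nothing."""
    cells = [c for c in replicas if c is not None]
    if not cells:
        return None
    winner = cells[0]
    for cell in cells[1:]:
        winner = reconcile(winner, cell)
    return winner


def visible_value(cell: Optional[Cell]) -> Optional[str]:
    """What a read would return: nothing if the winner is missing or a tombstone."""
    return None if cell is None or cell.is_tombstone else cell.value


# Made-up data: a bounce written at t=1000, then deleted at t=2000.
live = Cell("550 mailbox unavailable", 1_000)
tomb = Cell(None, 2_000)

# Expected: the newer tombstone shadows the older data after the merge.
expected = merge_replicas([live, tomb, tomb])

# Symptom: if no replica contributes the tombstone, the old data wins.
symptom = merge_replicas([live, None, None])

print(visible_value(expected))
print(visible_value(symptom))
```

In this model the first merge hides the column and the second returns the old bounce message, which is the "resurrected" behavior seen after repair.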