Hey guys,

We're having a very strange issue: deleted columns get resurrected when
"repair" is run on a node.

Some info about the setup: Cassandra 2.0.13, multi-datacenter, with 12
nodes in one datacenter and 6 nodes in the other. Schema:

cqlsh> describe keyspace blackbook;

CREATE KEYSPACE blackbook WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'IAD': '3',
  'ORD': '3'
};

USE blackbook;

CREATE TABLE bounces (
  domainid text,
  address text,
  message text,
  "timestamp" bigint,
  PRIMARY KEY (domainid, address)
) WITH
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

The "bounces" table uses wide rows: each "domainid" can hold hundreds of
thousands of addresses (usually far fewer in practice, though some rows
may contain up to several million columns).

All queries are done using LOCAL_QUORUM consistency. Sometimes bounces are
deleted from the table using the following CQL3 statement:

delete from bounces where domainid = 'domain.com' and address = 'al...@example.com';

The problem is that after "repair" is run on any node that owns the
"domain.com" key, the column gets resurrected on all nodes, as if the
tombstone had disappeared. We checked this multiple times using cqlsh:
issue a delete statement and verify that the data is no longer returned;
then run "repair", and the deleted data is returned again.

Our gc_grace_seconds is at the default value (864000 seconds, i.e. 10
days) and no node was ever down for anywhere close to 10 days, so that
doesn't look related. We also made sure all our servers are running ntpd,
so time synchronization should not be an issue either.
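
To spell out the gc_grace reasoning, here is a minimal Python sketch. It
assumes the simplified rule that a tombstone only becomes eligible for
purging once more than gc_grace_seconds have passed since the delete; the
real purge happens during compaction and depends on further conditions, so
this is an illustration, not Cassandra's actual logic. The function names
and the example dates are made up for the sketch.

```python
from datetime import datetime, timedelta

# Value taken from the bounces table definition above.
GC_GRACE_SECONDS = 864000


def gc_grace_days(gc_grace_seconds: int) -> float:
    """Convert gc_grace_seconds to days (86400 seconds per day)."""
    return gc_grace_seconds / 86400


def tombstone_purgeable(deleted_at: datetime, now: datetime,
                        gc_grace_seconds: int = GC_GRACE_SECONDS) -> bool:
    """Simplified model (an assumption, not Cassandra code): a tombstone is
    eligible for purging only after gc_grace_seconds have fully elapsed."""
    return now - deleted_at > timedelta(seconds=gc_grace_seconds)


if __name__ == "__main__":
    deleted_at = datetime(2015, 1, 1)  # hypothetical delete time
    print(gc_grace_days(GC_GRACE_SECONDS))
    # Repair run one day after the delete: tombstone should still exist.
    print(tombstone_purgeable(deleted_at, deleted_at + timedelta(days=1)))
```

In our case the repair is run minutes after the delete, so under this
model the tombstone should still be present on every replica.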

Have you guys ever seen anything like this, or do you have any idea what
may be causing this behavior? What could make a tombstone disappear during
a "repair" operation?

Thanks for your help. Let me know if I can provide more information.

Roman
