Hello,
We are making a change to our internal data model that will result in
the eventual deletion of over a hundred million rows from a Cassandra
column family. From what I understand, this will result in the
generation of tombstones, which will be cleaned up during compaction
once gc_grace_seconds has elapsed (default: 864000 seconds, i.e. 10 days).
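To make my understanding of the timing concrete, here is a minimal sketch
of the arithmetic. It only models my assumption about when a tombstone
becomes eligible for removal; the function names are hypothetical and this
is not Cassandra's actual code. It also ignores the fact that a compaction
still has to include the SSTable(s) holding the tombstone before anything
is actually dropped.

```python
from datetime import datetime, timedelta

# Default gc_grace_seconds for a column family: 864000 s = 10 days.
GC_GRACE_SECONDS = 864000


def purgeable_at(deleted_at, gc_grace_seconds=GC_GRACE_SECONDS):
    """Earliest time a compaction could drop a tombstone written at deleted_at."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def is_purgeable(deleted_at, now, gc_grace_seconds=GC_GRACE_SECONDS):
    """True once the grace period has fully elapsed (eligibility only)."""
    return now >= purgeable_at(deleted_at, gc_grace_seconds)


if __name__ == "__main__":
    deleted = datetime(2012, 7, 1)
    print(purgeable_at(deleted))
    print(is_purgeable(deleted, datetime(2012, 7, 10)))
```

If that model is right, lowering gc_grace_seconds only moves the eligibility
time earlier; the space is still not freed until a compaction runs.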
A couple of questions:
1) As one can imagine, the index and bloom filter for this column family
are large. Am I correct to assume that bloom filter and index space will
not be reduced until after gc_grace_seconds has elapsed?
2) If I manually run repair across the cluster, is there a process I
can use to safely remove these tombstones before gc_grace_seconds expires,
so that this memory is freed sooner?
3) Any words of warning before we undertake this?
We are running Cassandra 1.1.2 on a 6-node cluster with a replication
factor of 3. We use LOCAL_QUORUM consistency for all operations.
Thanks!
-Mike