> 1) As one can imagine, the index and bloom filter for this column family is
> large. Am I correct to assume that bloom filter and index space will not be
> reduced until after gc_grace_period?

Yes.

> 2) If I would manually run repair across a cluster, is there a process I can
> use to safely remove these tombstones before gc_grace period to free this
> memory sooner?

There is nothing that specifically purges tombstones.

You can temporarily reduce gc_grace_seconds and then trigger compaction, either by reducing min_compaction_threshold to 2 and doing a flush, or by kicking off a user-defined compaction through the JMX interface.

> 3) Any words of warning when undergoing this?

Make sure you have a good breakfast.
(It's more general advice than Cassandra-specific.)

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/12/2012, at 8:51 AM, Mike <mthero...@yahoo.com> wrote:

> Hello,
>
> We are undergoing a change to our internal data model that will result in the
> eventual deletion of over a hundred million rows from a Cassandra column
> family. From what I understand, this will result in the generation of
> tombstones, which will be cleaned up during compaction, after gc_grace_period
> time (default: 10 days).
>
> A couple of questions:
>
> 1) As one can imagine, the index and bloom filter for this column family is
> large. Am I correct to assume that bloom filter and index space will not be
> reduced until after gc_grace_period?
>
> 2) If I would manually run repair across a cluster, is there a process I can
> use to safely remove these tombstones before gc_grace period to free this
> memory sooner?
>
> 3) Any words of warning when undergoing this?
>
> We are running Cassandra 1.1.2 on a 6 node cluster and a Replication Factor
> of 3. We use LOCAL_QUORUM consistency for all operations.
>
> Thanks!
> -Mike
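
To make the reasoning behind the second answer concrete, here is a rough sketch (not Cassandra code) of why lowering gc_grace_seconds before compacting matters. It models only the time check: a tombstone can be dropped by a compaction once it is older than gc_grace_seconds. The function names and timestamps are hypothetical, and the model ignores the other conditions a real compaction applies, such as which SSTables take part. It also assumes the deletes have already reached every replica (for example through the repair mentioned in the question), since that is what the grace period normally protects.

```python
from datetime import datetime, timedelta

# Default gc_grace_seconds: 10 days, as mentioned in the thread.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60


def purgeable_after(deleted_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Earliest time at which a compaction may drop a tombstone written at deleted_at."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def is_purgeable(deleted_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """True if a compaction running at `now` could drop the tombstone (simplified model)."""
    return now > purgeable_after(deleted_at, gc_grace_seconds)


# Hypothetical timestamps, chosen only for illustration.
deleted_at = datetime(2012, 12, 30, 8, 51)
compaction_time = deleted_at + timedelta(hours=2)

print(is_purgeable(deleted_at, compaction_time))        # default 10-day grace
print(is_purgeable(deleted_at, compaction_time, 3600))  # grace lowered to 1 hour
```

With the default grace period the first check is False, so a compaction two hours after the delete keeps the tombstone. With the grace period lowered to one hour the second check is True, which is why the setting has to be reduced before the compaction is triggered and restored afterwards.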