Tombstone management is a big conversation; you can manage it in one of the following ways:
1) Set a gc_grace_seconds of 0 and then run nodetool compact, while using size-tiered compaction, as frequently as needed. This is often a pretty lousy solution, as a gc_grace_seconds of 0 means you're not very partition tolerant and it's easy to bring data back from the dead if you don't manage how you bring nodes back online correctly. Also, nodetool compact is super intensive. I don't recommend this approach unless you're already very operationally sound. (There's a rough sketch of what this looks like at the bottom of this mail.)

2) Partition your data using a scheme that matches your domain model. It sounds like you're using a queue approach, and by and large a distributed database that relies on tombstones is going to struggle with that by default. I have, however, worked with a number of customers that use Cassandra for a queue at scale, and I detailed the modeling workarounds here: http://lostechies.com/ryansvihla/2014/10/20/domain-modeling-around-deletes-or-using-cassandra-as-a-queue-even-when-you-know-better/ (A minimal bucketing sketch is also at the bottom of this mail.)

On Tue, Jan 6, 2015 at 4:24 AM, Jens-U. Mozdzen <jmozd...@nde.ag> wrote:
> Hi Eduardo,
>
> Quoting Eduardo Cusa <eduardo.c...@usmediaconsulting.com>:
>
>> [...]
>> I have to worry about the tombstones generated? Considering that I will
>> have many daily set updates
>>
> that depends on your definition of "many"... we've run into a situation
> where we wanted to age out old data using TTL... unfortunately, we ran into
> the "tombstone_failure_threshold" limit rather quickly, having thousands of
> record updates per second. That left us with a CF containing millions of
> records that we couldn't "select" the way we originally intended.
>
> Regards,
> Jens
>

--
Thanks,
Ryan Svihla
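
P.S. For what it's worth, here's roughly what option 1 looks like in practice. The keyspace and table names are made up; adjust them (and the settings) for your own schema, and understand the data-resurrection risk before copying this:

    -- CQL: let tombstones be purged immediately, use size-tiered compaction
    ALTER TABLE my_keyspace.my_table
      WITH gc_grace_seconds = 0
      AND compaction = { 'class' : 'SizeTieredCompactionStrategy' };

    # shell: then force a major compaction on each node, as often as you can afford
    nodetool compact my_keyspace my_table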
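
And a minimal sketch of the bucketing idea behind option 2 (again, the schema here is hypothetical; the blog post above goes into the real trade-offs):

    -- Bucket the "queue" by a coarse time window so consumers only ever read
    -- the current bucket and never scan over tombstones left by old, consumed items.
    CREATE TABLE my_keyspace.work_queue (
        bucket      text,      -- e.g. '2015-01-06-14' for an hourly bucket
        enqueued_at timeuuid,
        payload     blob,
        PRIMARY KEY (bucket, enqueued_at)
    );

    -- Consumers read only the bucket they care about:
    SELECT enqueued_at, payload
      FROM my_keyspace.work_queue
     WHERE bucket = '2015-01-06-14';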