Thanks Jens and Ryan, it is now clear to me what happens with tombstones for a CF row.
Now, does the same behavior that applies to CF rows also apply to elements in a set data type? (A small sketch of what I mean is included after the quoted thread below.)

Regards

On Tue, Jan 6, 2015 at 12:31 PM, Ryan Svihla <r...@foundev.pro> wrote:

> Tombstone management is a big conversation; you can manage it in one of
> the following ways:
>
> 1) Set gc_grace_seconds to 0 and then run nodetool compact (while using
> size-tiered compaction) as frequently as needed. This is often a pretty
> lousy solution, as a gc_grace_seconds of 0 means you're not very partition
> tolerant, and it's easy to bring data back from the dead if you don't
> manage how you bring nodes back online correctly. Also, nodetool compact
> is very intensive. I don't recommend this approach unless you're already
> very operationally sound.
>
> 2) Partition your data using a scheme that matches your domain model. It
> sounds like you're using a queue approach, and by and large a distributed
> database that relies on tombstones is going to struggle with that by
> default. I have, however, worked with a number of customers that use
> Cassandra for a queue at scale, and I detailed the modeling workarounds here:
> http://lostechies.com/ryansvihla/2014/10/20/domain-modeling-around-deletes-or-using-cassandra-as-a-queue-even-when-you-know-better/
>
> On Tue, Jan 6, 2015 at 4:24 AM, Jens-U. Mozdzen <jmozd...@nde.ag> wrote:
>
>> Hi Eduardo,
>>
>> Quoting Eduardo Cusa <eduardo.c...@usmediaconsulting.com>:
>>
>>> [...]
>>> Do I have to worry about the tombstones generated, considering that I
>>> will have many daily set updates?
>>
>> That depends on your definition of "many"... We ran into a situation
>> where we wanted to age out old data using TTL; unfortunately, we hit the
>> "tombstone_failure_threshold" limit rather quickly, having thousands of
>> record updates per second. That left us with a CF containing millions of
>> records that we couldn't "select" the way we originally intended.
>>
>> Regards,
>> Jens
>
>
> --
>
> Thanks,
> Ryan Svihla
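For reference, here is a minimal CQL sketch of how tombstones show up for a set column (the table and column names are hypothetical, only to illustrate the question):

    CREATE TABLE user_tags (
        user_id text PRIMARY KEY,
        tags    set<text>
    );

    -- Adding elements only writes new cells, no tombstones:
    UPDATE user_tags SET tags = tags + {'sports'} WHERE user_id = 'u1';

    -- Removing individual elements writes one cell tombstone per removed element:
    UPDATE user_tags SET tags = tags - {'sports'} WHERE user_id = 'u1';

    -- Overwriting the whole set (or INSERTing the column) first writes a
    -- range tombstone that shadows the previous contents:
    UPDATE user_tags SET tags = {'news'} WHERE user_id = 'u1';

As far as I know, the same gc_grace_seconds and compaction rules discussed above for row and column tombstones also apply to the tombstones these set operations leave behind.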
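And a rough sketch of the kind of time-bucketed layout that the partitioning advice in Ryan's reply points at (this is only one common variant of that idea, not necessarily the exact model from the linked article; all names here are made up):

    CREATE TABLE event_queue (
        bucket   text,        -- e.g. a queue name plus an hour, 'inbox:2015-01-06-12'
        event_id timeuuid,
        payload  text,
        PRIMARY KEY (bucket, event_id)
    );

    -- Rows can expire via TTL instead of explicit deletes; note that
    -- expired cells still become tombstones, which is how the
    -- tombstone_failure_threshold problem Jens describes can appear:
    INSERT INTO event_queue (bucket, event_id, payload)
    VALUES ('inbox:2015-01-06-12', now(), 'example payload')
    USING TTL 86400;

    -- Readers only touch the current bucket, so tombstones accumulating
    -- in old buckets are never scanned by the hot read path:
    SELECT payload FROM event_queue
     WHERE bucket = 'inbox:2015-01-06-12'
     LIMIT 100;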