If you never delete except by ttl, and always write with the same ttl (or
monotonically increasing), you can set gc_grace_seconds to 0.
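For instance, a TTL-only table along these lines (keyspace, table, and column names here are placeholders, not from the thread) could carry both settings in its schema:

```sql
-- Hypothetical TTL-only table: every write expires via the table-level TTL
-- and nothing is ever deleted explicitly, so gc_grace_seconds can be 0.
CREATE TABLE my_keyspace.events (
    id uuid PRIMARY KEY,
    payload text
) WITH default_time_to_live = 10
  AND gc_grace_seconds = 0;
```

With default_time_to_live set on the table, every insert gets the same TTL unless a statement overrides it, which keeps the "same or monotonically increasing TTL" invariant easy to hold.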

That's what we do. There have been discussions on the list over the last
few years re this topic.
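As a back-of-the-envelope check on the numbers reported below, a minimal sketch (the function name and the simple linear model are my own illustration, not anything from the thread):

```python
# Rough model: with TTL-expired rows and no compaction yet, tombstoned
# cells pile up linearly with the insert rate. The rates are taken from
# the thread below; the function itself is an illustrative assumption.

def tombstones_accumulated(insert_rate_per_s, window_s):
    """Tombstones left behind over window_s seconds at a steady insert rate."""
    return insert_rate_per_s * window_s

# 500 inserts/s for 20 s matches the 10,000-column warning in the log:
print(tombstones_accumulated(500, 20))  # 10000
```

At that rate the default tombstone_failure_threshold of 100,000 is reached in about 200 seconds, which is consistent with the query abort reported below.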

ml

On Tue, Apr 21, 2015 at 11:14 AM, Walsh, Stephen <stephen.wa...@aspect.com>
wrote:

>  We were chatting with Jon Haddad about a week ago about our tombstone
> issue using Cassandra 2.0.14.
>
> To Summarize
>
>
>
> We have a 3 node cluster with replication-factor=3 and compaction =
> SizeTiered
>
> We use 1 keyspace with 1 table
>
> Each row has about 40 columns
>
> Each row has a TTL of 10 seconds
>
>
>
> We insert about 500 rows per second in a prepared batch** (about 3 MB of
> network overhead)
>
> We query the entire table once per second
>
>
>
> **This is to enable consistent data, e.g. the batch is transactional, so we
> get all queried data from one insert and not a mix of two or more.
>
>
>
>
>
> It seemed that the rows we insert every second were never deleted by the
> TTL, or so we thought.
>
> After some time we got this message on the query side
>
>
>
>
>
> #######################################
>
> ERROR [ReadStage:91] 2015-04-21 12:27:03,902 SliceQueryFilter.java (line
> 206) Scanned over 100000 tombstones in keyspace.table; query aborted (see
> tombstone_failure_threshold)
>
> ERROR [ReadStage:91] 2015-04-21 12:27:03,931 CassandraDaemon.java (line
> 199) Exception in thread Thread[ReadStage:91,5,main]
>
> java.lang.RuntimeException:
> org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>
>                 at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)
>
>                 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
>                 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
>                 at java.lang.Thread.run(Thread.java:745)
>
> Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>
> #######################################
>
>
>
>
>
> So we know tombstones are in fact being created.
>
> The solution was to alter the table schema and set gc_grace_seconds to 60
> seconds.
>
> This worked for 20 seconds, then we saw this
>
>
>
>
>
> #######################################
>
> Read 500 live and 30000 tombstoned cells in keyspace.table (see
> tombstone_warn_threshold). 10000 columns was requested, slices=[-],
> delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
>
> #######################################
>
>
>
> So the warning fires every 20 seconds (500 inserts/s x 20 seconds = 10,000
> tombstones).
>
> So now we have gc_grace_seconds set to 10 seconds.
>
> But it feels very wrong to have it at such a low number, especially if we
> move to a larger cluster. This just won't fly.
>
> What are we doing wrong?
>
>
>
> We shouldn’t increase the tombstone threshold as that is extremely
> dangerous.
>
>
>
>
>
> Best Regards
>
> Stephen Walsh
>
>
