No. It's generally only an issue with heavy delete workloads, and it's sometimes possible to design around it.
cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2011, at 1:18 PM, Daning wrote:

> Thanks. Do you have a plan to improve this? I think tombstones should be
> kept separate from live data since they serve a different purpose, either
> built into a separate SSTable or indexed differently. It is pretty costly
> to do the filtering while reading.
>
> Daning
>
> On 10/04/2011 01:34 PM, aaron morton wrote:
>>
>> I would not set gc_grace_seconds to 0; set it to something small.
>>
>> gc_grace_seconds (or a TTL) is only the minimum amount of time the column
>> will stay in the data files. The columns are only purged when compaction
>> runs some time after that timespan has ended.
>>
>> If you are seeing issues where a heavy delete workload is having a
>> noticeably adverse effect on read performance, then you should look at
>> the data model. Consider ways to spread the write / read / delete
>> workload over multiple rows.
>>
>> If you cannot get away from it, then experiment with reducing the
>> min_compaction_threshold of the CFs so that compaction kicks in sooner
>> and (potentially) tombstones are purged faster.
>>
>> Cheers
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5/10/2011, at 6:03 AM, Daning wrote:
>>
>>> Thanks Aaron. How about I set gc_grace_seconds to 0, or to something
>>> like 2 hours? I would like to clean up tombstones sooner; I don't mind
>>> losing some data, and all my columns have a TTL.
>>>
>>> If one node is down longer than gc_grace_seconds and the tombstones have
>>> been removed, then once the node is back up, my understanding is that
>>> the deleted data will be synced back. In that case my data will be
>>> processed twice, which is not a big deal for me.
>>>
>>> Thanks,
>>>
>>> Daning
>>>
>>>
>>> On 10/04/2011 01:27 AM, aaron morton wrote:
>>>>
>>>> Yes, that's the slice query skipping past the tombstone columns.
>>>>
>>>> Cheers
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 4/10/2011, at 4:24 PM, Daning Wang wrote:
>>>>
>>>>> Lots of SliceQueryFilter entries in the log; is that the tombstone handling?
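
The DEBUG lines quoted below come from turning the read-path logging up, which is also what Aaron suggests further down the thread ("turn the logging up to DEBUG and look for messages from something starting with 'Slice'"). As a rough sketch only, assuming the stock conf/log4j-server.properties shipped with 0.8 and that SliceQueryFilter still lives in the org.apache.cassandra.db.filter package, the change could look like this:

    # conf/log4j-server.properties (sketch only)
    # Leave the existing root logger at INFO so the log is not flooded by
    # every component (in the stock 0.8 file it looks roughly like this):
    log4j.rootLogger=INFO,stdout,R

    # Add a logger for just the read-filter package; SliceQueryFilter lives
    # in org.apache.cassandra.db.filter, and DEBUG here produces the
    # "collecting ..." lines quoted below.
    log4j.logger.org.apache.cassandra.db.filter=DEBUG

In each "collecting" line the second field (true) appears to be the deleted flag, which matches Aaron's reading that the slice query is skipping past tombstone columns.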
>>>>>
>>>>> DEBUG [ReadStage:49] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582939743663:true:4@1317582939933000
>>>>> DEBUG [ReadStage:50] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317573253148778:true:4@1317573253354000
>>>>> DEBUG [ReadStage:43] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317669552951428:true:4@1317669553018000
>>>>> DEBUG [ReadStage:33] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317581886709261:true:4@1317581886957000
>>>>> DEBUG [ReadStage:52] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317568165152246:true:4@1317568165482000
>>>>> DEBUG [ReadStage:36] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317567265089211:true:4@1317567265405000
>>>>> DEBUG [ReadStage:53] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317674324843122:true:4@1317674324946000
>>>>> DEBUG [ReadStage:38] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317571990078721:true:4@1317571990141000
>>>>> DEBUG [ReadStage:57] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317671855234221:true:4@1317671855239000
>>>>> DEBUG [ReadStage:54] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317558305262954:true:4@1317558305337000
>>>>> DEBUG [RequestResponseStage:11] 2011-10-03 20:15:07,941 ResponseVerbHandler.java (line 48) Processing response on a callback from 12347@/10.210.101.104
>>>>> DEBUG [RequestResponseStage:9] 2011-10-03 20:15:07,941 AbstractRowResolver.java (line 66) Preprocessed data response
>>>>> DEBUG [RequestResponseStage:13] 2011-10-03 20:15:07,941 AbstractRowResolver.java (line 66) Preprocessed digest response
>>>>> DEBUG [ReadStage:58] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317581337972739:true:4@1317581338044000
>>>>> DEBUG [ReadStage:64] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582656796332:true:4@1317582656970000
>>>>> DEBUG [ReadStage:55] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317569432886284:true:4@1317569432984000
>>>>> DEBUG [ReadStage:45] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317572658687019:true:4@1317572658718000
>>>>> DEBUG [ReadStage:47] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582281617755:true:4@1317582281717000
>>>>> DEBUG [ReadStage:48] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317549607869226:true:4@1317549608118000
>>>>> DEBUG [ReadStage:34] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1:
>>>>>
>>>>> On Thu, Sep 29, 2011 at 2:17 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>> As with any situation involving the un-dead, it really is the number of
>>>>> Zombies, Mummies or Vampires that is the concern.
>>>>>
>>>>> If you delete data there will always be tombstones. If you have a
>>>>> delete-heavy workload there will be more tombstones. This is why
>>>>> implementing a queue with Cassandra is a bad idea.
>>>>>
>>>>> gc_grace_seconds (and column TTL) are the *minimum* amount of time the
>>>>> tombstones will stay in the data files; there is no maximum.
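
To make the two knobs discussed in this thread concrete, here is a minimal cassandra-cli sketch for the 0.8 line. The keyspace and column family names (MyKeyspace, Queue) are made up for illustration, and the attribute spellings gc_grace and min_compaction_threshold are assumptions worth checking against "help update column family;" in your build:

    use MyKeyspace;

    update column family Queue with gc_grace = 86400;
    update column family Queue with min_compaction_threshold = 2;

    set Queue['some_row']['some_col'] = 'some_value' with ttl = 3600;

The first update keeps gc_grace small but non-zero (one day here) so a briefly dead replica can still learn about deletes; the second lets minor compaction kick in sooner, which is when tombstones older than gc_grace can actually be dropped. The TTL write shows the other path mentioned above: expired columns become tombstones and follow the same gc_grace rule.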
>>>>>
>>>>> Your read performance also depends on the number of SSTables the row is
>>>>> spread over; see
>>>>> http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
>>>>>
>>>>> If you really wanted to purge them, then yes, a repair followed by a
>>>>> major compaction would be the way to go. Also consider whether it's
>>>>> possible to design the data model around the problem, e.g. partitioning
>>>>> rows by date. IMHO I would look at making data model changes before
>>>>> implementing a compaction policy, or consider whether Cassandra is the
>>>>> right store if you have a delete-heavy workload.
>>>>>
>>>>> Cheers
>>>>>
>>>>>
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Developer
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>>
>>>>> On 30/09/2011, at 3:27 AM, Daning Wang wrote:
>>>>>
>>>>>> Jonathan/Aaron,
>>>>>>
>>>>>> Thank you both for the replies. I will change GCGracePeriod to 1 day
>>>>>> and see what happens.
>>>>>>
>>>>>> Is there a way to purge tombstones at any time? If tombstones affect
>>>>>> performance, we want them purged right away, not after GCGracePeriod.
>>>>>> We know all the nodes are up, and we can run a repair first to make
>>>>>> sure the data is consistent before purging.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Daning
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 28, 2011 at 5:22 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>>> If I had to guess, I would say it was spending time handling
>>>>>> tombstones. If you see it happen again and are interested, turn the
>>>>>> logging up to DEBUG and look for messages from something starting with
>>>>>> "Slice".
>>>>>>
>>>>>> Minor (automatic) compaction will, over time, purge the tombstones.
>>>>>> Until then, reads must read and then discard the data deleted by the
>>>>>> tombstones. If you perform a big (i.e. 100k's) delete, this can reduce
>>>>>> performance until compaction does its thing.
>>>>>>
>>>>>> My second guess would be read repair (or the simple consistency checks
>>>>>> on read) kicking in. That would show up in the "ReadRepairStage" in
>>>>>> TPSTATS.
>>>>>>
>>>>>> It may have been neither of those two things; these are just guesses.
>>>>>> If you have more issues, let us know and provide some more info.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>>
>>>>>> -----------------
>>>>>> Aaron Morton
>>>>>> Freelance Cassandra Developer
>>>>>> @aaronmorton
>>>>>> http://www.thelastpickle.com
>>>>>>
>>>>>> On 29/09/2011, at 6:35 AM, Daning wrote:
>>>>>>
>>>>>> > I have an app polling a few CFs (select first N * from CF). There was
>>>>>> > data in the CFs, but it was later deleted, so the CFs were empty for
>>>>>> > a long time. I found Cassandra CPU usage climbing to 80%, when
>>>>>> > normally it uses less than 30%. I issued the select query manually
>>>>>> > and the response felt slow. I tried nodetool compact/repair for those
>>>>>> > CFs but that did not help. Later, I issued 'truncate' for all the CFs
>>>>>> > and CPU usage dropped to 1%.
>>>>>> >
>>>>>> > Can somebody explain to me why I needed to truncate an empty CF, and
>>>>>> > what else I could do to bring the CPU usage down?
>>>>>> >
>>>>>> > I am running 0.8.6.
>>>>>> >
>>>>>> > Thanks,
>>>>>> >
>>>>>> > Daning
>>>>>> >
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
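
For the "purge tombstones at any time" question above, the sequence Aaron describes maps onto nodetool roughly as follows; the host, keyspace and column family names are placeholders. Repair runs first so that no replica is missing the deletes, then a major compaction can drop tombstones that are older than gc_grace, and tpstats shows whether the "ReadRepairStage" Aaron mentions is doing a lot of work:

    # Placeholder host/keyspace/CF names; substitute your own.
    nodetool -h 127.0.0.1 repair MyKeyspace Queue
    nodetool -h 127.0.0.1 compact MyKeyspace Queue
    nodetool -h 127.0.0.1 tpstats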