You should find that the patch will apply cleanly to the 2.0.5 release, so you could apply it yourself.
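For anyone who wants to try that, a rough sketch of applying a patch to the 2.0.5 source release; the patch file name below is an assumption (use whatever attachment ends up on CASSANDRA-6655):

    # Fetch and unpack the 2.0.5 source release from the Apache archive
    wget https://archive.apache.org/dist/cassandra/2.0.5/apache-cassandra-2.0.5-src.tar.gz
    tar xzf apache-cassandra-2.0.5-src.tar.gz
    cd apache-cassandra-2.0.5-src

    # Apply the ticket's patch (file name assumed) and rebuild with ant
    patch -p1 < ../6655-2.0.patch
    ant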
On 5 February 2014 18:56, Robert Wille <rwi...@fold3.com> wrote:

> Thank you so much. Everything I had seen pointed to this being the case.
> I'm glad that someone in the know has confirmed this bug and fixed it.
> Now I just need to figure out where to go from here: do I wait, use the
> dev branch, or work around it?
>
> Robert
>
> From: Benedict Elliott Smith <belliottsm...@datastax.com>
> Reply-To: <user@cassandra.apache.org>
> Date: Wednesday, February 5, 2014 at 8:32 AM
> To: <user@cassandra.apache.org>
> Subject: Re: Lots of deletions results in death by GC
>
> I believe there is a bug, and I have filed a ticket for it:
> https://issues.apache.org/jira/browse/CASSANDRA-6655
>
> I will have a patch uploaded shortly, but it has just missed the 2.0.5
> release window, so you'll either need to grab the development branch
> once it's committed or wait until 2.0.6.
>
> On 5 February 2014 15:09, Robert Wille <rwi...@fold3.com> wrote:
>
>> Yes. It's kind of an unusual workload: an insertion phase followed by
>> a deletion phase, generally not overlapping.
>>
>> From: Benedict Elliott Smith <belliottsm...@datastax.com>
>> Reply-To: <user@cassandra.apache.org>
>> Date: Tuesday, February 4, 2014 at 5:29 PM
>> To: <user@cassandra.apache.org>
>> Subject: Re: Lots of deletions results in death by GC
>>
>> Is it possible you are generating *exclusively* deletes for this table?
>>
>> On 5 February 2014 00:10, Robert Wille <rwi...@fold3.com> wrote:
>>
>>> I ran my test again, and Flush Writer's "All time blocked" increased
>>> to 2, and shortly thereafter GC went into its death spiral. I doubled
>>> memtable_flush_writers (to 2) and memtable_flush_queue_size (to 8)
>>> and tried again.
>>>
>>> This time, the table that had always sat with Memtable data size = 0
>>> now showed increases in Memtable data size. That was encouraging. It
>>> never flushed, which isn't too surprising, because that table has
>>> relatively few rows and they are pretty wide. However, on the fourth
>>> table to clean, Flush Writer's "All time blocked" went to 1, and then
>>> there were no more completed events, and about 10 minutes later GC
>>> went into its death spiral. I assume that each time Flush Writer
>>> completes an event, a table was flushed. Is that right? Also, I got
>>> two dropped-mutation messages at the same time that "All time
>>> blocked" incremented.
>>>
>>> I then increased the writers and queue size to 3 and 12,
>>> respectively, and ran my test again. This time "All time blocked"
>>> remained at 0, but I still suffered death by GC.
>>>
>>> I would almost think this is caused by high load on the server, but
>>> I've never seen CPU utilization go above about two of my eight
>>> available cores. If high load triggers this problem, that is very
>>> disconcerting: it means a CPU spike could cripple a node, not
>>> permanently, but until a manual flush occurs.
>>>
>>> If anyone has any further thoughts, I'd love to hear them. I'm quite
>>> at the end of my rope.
>>>
>>> Thanks in advance
>>>
>>> Robert
>>>
>>> From: Nate McCall <n...@thelastpickle.com>
>>> Reply-To: <user@cassandra.apache.org>
>>> Date: Saturday, February 1, 2014 at 9:25 AM
>>> To: Cassandra Users <user@cassandra.apache.org>
>>> Subject: Re: Lots of deletions results in death by GC
>>>
>>> What's the output of 'nodetool tpstats' while this is happening?
>>> Specifically, is Flush Writer "All time blocked" increasing? If so,
>>> play around with turning up memtable_flush_writers and
>>> memtable_flush_queue_size and see if that helps.
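For reference, both of those knobs live in cassandra.yaml. A minimal sketch with the stock 2.0-era defaults (the values Robert starts from and later doubles); the right settings are workload- and hardware-dependent:

    # cassandra.yaml (Cassandra 2.0.x) -- defaults shown; raise these if
    # "All time blocked" keeps climbing in nodetool tpstats
    memtable_flush_writers: 1      # threads writing memtables to disk
    memtable_flush_queue_size: 4   # full memtables allowed to queue for a writer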
>>>
>>> On Sat, Feb 1, 2014 at 9:03 AM, Robert Wille <rwi...@fold3.com> wrote:
>>>
>>>> A few days ago I posted about an issue I'm having where GC takes a
>>>> long time (20-30 seconds), it happens repeatedly, and basically no
>>>> work gets done. I've done further investigation, and I now believe I
>>>> know the cause. If I do a lot of deletes, it creates memory pressure
>>>> until the memtables are flushed, but Cassandra doesn't flush them.
>>>> If I manually flush, then life is good again (although that takes a
>>>> very long time because of the GC issue). If I just leave the
>>>> flushing to Cassandra, I end up with death by GC. I believe that
>>>> when the memtables are full of tombstones, Cassandra doesn't realize
>>>> how much memory the memtables are actually taking up, and so it
>>>> doesn't proactively flush them to free up heap.
>>>>
>>>> As I was deleting records out of one of my tables, I was watching it
>>>> via nodetool cfstats, and I found a very curious thing:
>>>>
>>>> Memtable cell count: 1285
>>>> Memtable data size, bytes: 0
>>>> Memtable switch count: 56
>>>>
>>>> As the deletion process was chugging away, the memtable cell count
>>>> increased, as expected, but the data size stayed at 0. No flushing
>>>> occurred.
>>>>
>>>> Here's the schema for this table:
>>>>
>>>> CREATE TABLE bdn_index_pub (
>>>>     tshard VARCHAR,
>>>>     pord INT,
>>>>     ord INT,
>>>>     hpath VARCHAR,
>>>>     page BIGINT,
>>>>     PRIMARY KEY (tshard, pord)
>>>> ) WITH gc_grace_seconds = 0
>>>>   AND compaction = { 'class' : 'LeveledCompactionStrategy',
>>>>                      'sstable_size_in_mb' : 160 };
>>>>
>>>> I have a few tables that I run this cleaning process on, and not all
>>>> of them exhibit this behavior. One of them reported an increasing
>>>> number of bytes, as expected, and it also flushed as expected.
>>>> Here's the schema for that table:
>>>>
>>>> CREATE TABLE bdn_index_child (
>>>>     ptshard VARCHAR,
>>>>     ord INT,
>>>>     hpath VARCHAR,
>>>>     PRIMARY KEY (ptshard, ord)
>>>> ) WITH gc_grace_seconds = 0
>>>>   AND compaction = { 'class' : 'LeveledCompactionStrategy',
>>>>                      'sstable_size_in_mb' : 160 };
>>>>
>>>> In both cases, I'm deleting the entire record (i.e. specifying just
>>>> the first component of the primary key in the delete statement).
>>>> Most records in bdn_index_pub have 10,000 rows. bdn_index_child
>>>> usually has just a handful of rows, but a few records can have up to
>>>> 10,000.
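The deletes Robert describes would look something like the following in CQL (a sketch; the shard values are placeholders). Because only the partition key is given, each statement tombstones the entire record at once:

    -- drops every row in the partition (up to ~10,000 rows for bdn_index_pub)
    DELETE FROM bdn_index_pub WHERE tshard = 'shard-0001';
    DELETE FROM bdn_index_child WHERE ptshard = 'shard-0001';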
>>>>
>>>> Still a further mystery: 1285 tombstones in the bdn_index_pub
>>>> memtable doesn't seem like nearly enough to create a memory problem.
>>>> Perhaps there are other flaws in the memory metering, or perhaps
>>>> some other issue causes Cassandra to mismanage the heap when there
>>>> are a lot of deletes. One other thought: I page through these tables
>>>> and clean them out as I go. Perhaps there is some interaction
>>>> between the paging and the deleting that causes the GC problems, and
>>>> I should build a list of keys to delete and then delete them after
>>>> I've finished reading the entire table.
>>>>
>>>> I reduced memtable_total_space_in_mb from the default (probably 2.7
>>>> GB) to 1 GB, in hopes that it would force Cassandra to flush tables
>>>> before I ran into death by GC, but it didn't seem to help.
>>>>
>>>> I'm using Cassandra 2.0.4.
>>>>
>>>> Any insights would be greatly appreciated. I can't be the only one
>>>> who has periodic delete-heavy workloads. Hopefully someone else has
>>>> run into this and can give advice.
>>>>
>>>> Thanks
>>>>
>>>> Robert
>>>
>>> --
>>> -----------------
>>> Nate McCall
>>> Austin, TX
>>> @zznate
>>>
>>> Co-Founder & Sr. Technical Consultant
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com