On Wed, Apr 21, 2010 at 12:21:31PM -0500, Jonathan Ellis wrote:
> [moving to u...@]
> 
> 0.6 fixes replaying faster than it can flush.

Yeah, I noticed some of those fixes, and will probably take the leap into
0.6 if I can keep my cluster running. It's not doing too badly (about
400K reads and 250K writes per minute spread over 23 nodes), but some
of the m1.large instances frequently get into this backed-up state,
so I need to keep the cluster running first.
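
For a rough sense of scale, those cluster-wide numbers can be turned into
average per-node rates. This is only a back-of-the-envelope sketch: it
assumes the load is spread evenly and counts client-level operations only
(the replication factor isn't stated here, so real per-node write load
would be higher by that factor). The helper name is just for illustration.

```python
# Cluster-wide figures from above (per minute, across 23 nodes).
READS_PER_MIN = 400_000
WRITES_PER_MIN = 250_000
NODES = 23


def per_node_per_sec(ops_per_min, nodes=NODES):
    """Average ops/sec per node, assuming an even spread and no replication."""
    return ops_per_min / 60 / nodes


reads = per_node_per_sec(READS_PER_MIN)
writes = per_node_per_sec(WRITES_PER_MIN)
print(f"reads/s per node:  {reads:.0f}")   # about 290
print(f"writes/s per node: {writes:.0f}")  # about 181
```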

> as for why it backs up in the first place before the restart, you can
> either (a) throttle writes [set your timeout lower, make your clients
> back off temporarily when it gets a timeoutexception]

What timeout is this?  Is it something in the Thrift API or a Cassandra
configuration setting?
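
Setting the timeout question aside, the client back-off half of (a) could
be sketched roughly as below. This is only an illustration: WriteTimeout is
a hypothetical stand-in for whatever timeout exception the client library
raises, and write_with_backoff, the delays and the retry count are made-up
values, not anything taken from the Thrift API or Cassandra itself.

```python
import random
import time


class WriteTimeout(Exception):
    """Hypothetical stand-in for the client library's timeout exception."""


def write_with_backoff(do_write, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call do_write(); on a timeout, pause and retry with exponential back-off.

    The pause doubles on each failed attempt (capped at max_delay) and is
    jittered so that many clients do not all retry at the same instant.
    The last timeout is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return do_write()
        except WriteTimeout:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

A caller would wrap each write in a zero-argument function (for example a
lambda around the client's insert call) and pass it as do_write, so the
retry logic stays independent of the client library.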

> or (b) add capacity.  (b) is recommended.

Yeah, I've been doing that by adding xlarge instances with RAID0 disks,
which work better, but I keep running into issues with the old instances,
which hold up this work.  I'll keep chugging along and hopefully get things
sorted.

-Anthony

> 
> https://issues.apache.org/jira/browse/CASSANDRA-685 will mitigate this
> but there is still no substitute for adding capacity to match demand.
> 
> On Tue, Apr 20, 2010 at 4:57 PM, Anthony Molinaro
> <antho...@alumni.caltech.edu> wrote:
> > Hi,
> >
> >  I have a cassandra cluster where a couple things are happening.  Every
> > once in a while a node will start to get backed up.  Checking tpstats I
> > see a very large value for ROW-MUTATION-STAGE.  Sometimes it will be able
> > to clear it if I give it enough time; other times the VM OOMs.  With some
> > nodes I also see this happen during restarts: I'll restart and have to
> > wait 6-12 hours for the node to not be marked as 'Down'.
> > I've seen
> > http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> > and ended up with the following settings.
> >
> > KeysCachedFraction            : 0.01
> > MemtableSizeInMB              : 100
> > MemtableObjectCountInMillions : 0.5
> > Heap                          : -Xmx5G
> >
> > I only have 2 CFs in this instance and entries are small so in most cases
> > I hit MemtableObjectCountInMillions first and total MemtableSizeInMB is
> > about 60MB-120MB for the 2 CFs combined.
> >
> > Anyone have any pointers on where to look next?  These are m1.large EC2
> > instances (I want to move to xlarge to get more memory, but haven't yet
> > gotten clarification on the best process for node replacement, per my
> > other thread).
> >
> > Thanks,
> >
> > -Anthony
> >
> > --
> > ------------------------------------------------------------------------
> > Anthony Molinaro                           <antho...@alumni.caltech.edu>
> >

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <antho...@alumni.caltech.edu>
