[moving to u...@]

0.6 fixes the restart case, where the commit log replays faster than the
node can flush.

As for why it backs up in the first place, before the restart: you can
either (a) throttle writes [set your timeout lower, and make your clients
back off temporarily when they get a TimedOutException] or (b) add
capacity.  (b) is recommended.
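
For (a), a rough client-side sketch of "back off on timeout" might look
like the following. It is only an illustration: WriteTimeout and write_fn
are hypothetical stand-ins for whatever timeout exception and write call
your client library exposes, and the delay values are arbitrary.

```python
import random
import time


class WriteTimeout(Exception):
    """Hypothetical stand-in for the client's timeout exception."""


def write_with_backoff(write_fn, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call write_fn(); on WriteTimeout, wait and retry with growing delays."""
    for attempt in range(max_attempts):
        try:
            return write_fn()
        except WriteTimeout:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller decide what to do
            # Exponential backoff capped at max_delay, with jitter so that
            # clients do not all retry at the same moment.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

Backing off only relieves pressure on the overloaded node for a while; it
does not raise the write rate the cluster can sustain.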

https://issues.apache.org/jira/browse/CASSANDRA-685 will mitigate this
but there is still no substitute for adding capacity to match demand.

On Tue, Apr 20, 2010 at 4:57 PM, Anthony Molinaro
<antho...@alumni.caltech.edu> wrote:
> Hi,
>
>  I have a cassandra cluster where a couple things are happening.  Every
> once in a while a node will start to get backed up.  Checking tpstats I
> see a very large value for ROW-MUTATION-STAGE.  Sometimes it will be able
> to clear it if I give it enough time, other times the vm OOMs.  With some
> nodes I also see this happen during restarts, I'll restart and have to
> wait 6-12 hours for the node to not be marked as 'Down'.
> I've seen
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> and ended up with the following settings.
>
> KeysCachedFraction            : 0.01
> MemtableSizeInMB              : 100
> MemtableObjectCountInMillions : 0.5
> Heap                          : -Xmx5G
>
> I only have 2 CFs in this instance and entries are small so in most cases
> I hit MemtableObjectCountInMillions first and total MemtableSizeInMB is
> about 60MB-120MB for the 2 CFs combined.
>
> Anyone have any pointers on where to look next?  These are m1.large EC2
> instances (I want to move to xlarge to get more memory, but haven't yet
> gotten clarification on the best process for node replacement, per my
> other thread).
>
> Thanks,
>
> -Anthony
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro                           <antho...@alumni.caltech.edu>
>
