Hi,

  I have a Cassandra cluster where a couple of things are happening.  Every
once in a while a node will start to get backed up.  Checking tpstats, I
see a very large value for ROW-MUTATION-STAGE.  Sometimes the node can
clear the backlog if I give it enough time; other times the VM OOMs.  On
some nodes I also see this happen during restarts: I'll restart and have
to wait 6-12 hours before the node is no longer marked as 'Down'.
I've read
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
and ended up with the following settings:

KeysCachedFraction            : 0.01
MemtableSizeInMB              : 100
MemtableObjectCountInMillions : 0.5
Heap                          : -Xmx5G

I only have 2 CFs in this instance and entries are small, so in most cases
I hit MemtableObjectCountInMillions first, and the total memtable size is
about 60MB-120MB for the 2 CFs combined.
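
To show the arithmetic behind that, here is a rough Python sketch of which
threshold a single memtable reaches first. The two limits are the settings
above; the average entry size is a made-up illustrative number, and the
assumption that each counted object contributes a fixed number of bytes is
a simplification, not how Cassandra actually accounts for memory.

```python
# Settings taken from the post.
MEMTABLE_SIZE_MB = 100
MEMTABLE_OBJECT_COUNT_MILLIONS = 0.5

BYTES_PER_MB = 1024 * 1024


def breakeven_entry_bytes():
    """Average entry size at which both thresholds are hit together."""
    max_objects = MEMTABLE_OBJECT_COUNT_MILLIONS * 1_000_000
    return MEMTABLE_SIZE_MB * BYTES_PER_MB / max_objects


def first_threshold(avg_entry_bytes):
    """Return (threshold name, memtable size in MB when it is reached).

    avg_entry_bytes is a hypothetical average size per counted object.
    """
    max_objects = MEMTABLE_OBJECT_COUNT_MILLIONS * 1_000_000
    mb_at_count_limit = max_objects * avg_entry_bytes / BYTES_PER_MB
    if mb_at_count_limit < MEMTABLE_SIZE_MB:
        return "object count", mb_at_count_limit
    return "size", float(MEMTABLE_SIZE_MB)


if __name__ == "__main__":
    print("break-even entry size (bytes):", round(breakeven_entry_bytes(), 1))
    for size in (100, 300):  # illustrative entry sizes, not measured values
        print(size, first_threshold(size))
```

Under these assumptions the break-even point is roughly 210 bytes per
entry, so with smaller entries the object count limit is reached well
before a memtable grows to 100MB.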

Anyone have any pointers on where to look next?  These are m1.large EC2
instances (I want to move to xlarge to get more memory, but haven't yet
gotten clarification on the best process for node replacement, per my
other thread).

Thanks,

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <antho...@alumni.caltech.edu>
