Jake, or anyone else, got experience bulk loading into Lucandra?
Or does anyone have experience with JRockit?
Max, are you sending one document at a time into Lucene? Can you send them in batches (like Solr), and if so, does that reduce the
number of requests going to Cassandra?
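(For illustration only: Lucandra does its own writing to Cassandra, so the sketch below just shows the general batching idea with a Hector-style client; the cluster, keyspace, CF and column names are invented for the example. The point is that the columns for many documents are buffered client-side and sent in a single batch_mutate round trip instead of one request per column.)

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class BatchInsertSketch {
    public static void main(String[] args) {
        // Hypothetical cluster/keyspace/CF names, for illustration only.
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "localhost:9160");
        Keyspace ks = HFactory.createKeyspace("LucandraTest", cluster);
        Mutator<String> mutator = HFactory.createMutator(ks, StringSerializer.get());

        // Buffer the columns for many documents client-side...
        for (int doc = 0; doc < 100; doc++) {
            mutator.addInsertion("doc-" + doc, "Documents",
                    HFactory.createStringColumn("body", "document text " + doc));
        }
        // ...then push everything to Cassandra in one batch_mutate request.
        mutator.execute();
    }
}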
Also, cassandra.bat is configured with -XX:+HeapDumpOnOutOfMemoryError, so you should be able to take a look at where all the memory is going. The Riptano blog points to http://www.eclipse.org/mat/ ; also see http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr
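For reference, the relevant JVM options would look something like the lines below (the dump path is only an example, and JRockit may not honour these HotSpot-style flags in the same way); the resulting .hprof file can then be opened in MAT:

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=C:\cassandra\heapdumps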
Hope that helps.
Aaron
On 07 Dec, 2010,at 09:17 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
Accidentally sent to me.
Begin forwarded message:
From: Max <cassan...@ajowa.de>
Date: 07 December 2010 6:00:36 AM
To: Aaron Morton <aa...@thelastpickle.com>
Subject: Re: Re: Re: Cassandra 0.7 beta 3 OutOfMemory (OOM)

Thank you both for your answer!
After several tests with different parameters we came to the
conclusion that it must be a bug.
It looks very similar to: https://issues.apache.org/jira/browse/CASSANDRA-1014
For both CFs we reduced thresholds:
- memtable_flush_after_mins = 60 (both CFs are in constant use,
therefore the other thresholds should trigger first)
- memtable_throughput_in_mb = 40
- memtable_operations_in_millions = 0.3
- keys_cached = 0
- rows_cached = 0
- in_memory_compaction_limit_in_mb = 64
First we disabled caching, later we disabled compaction, and after that we set:
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 1
But our problem still appears:
While inserting files with Lucandra, memory usage grows slowly
until the OOM crash after about 50 min.
@Peter: In our latest test we stopped writing abruptly, but Cassandra
didn't relax and stayed at ~90% heap usage even after several minutes.
http://oi54.tinypic.com/2dueeix.jpg
With our heap calculation we should need:
64 MB * 2 * 3 + 1 GB = 1.4 GB
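(That is roughly the rule of thumb from the MemtableThresholds wiki page, i.e. memtable_throughput_in_mb * 3 * number of hot CFs + 1 GB + internal caches, so with two hot CFs and the small thresholds above the estimate stays well below a 3 GB heap either way.)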
We ran all recent tests with 3 GB. I think that should be OK for a
test machine.
Also, the consistency level is ONE.
But Aaron is right, Lucandra produces far more than 200 inserts/s:
my 200 documents per second translate to about 200 operations (WriteCount) on
the first CF and about 3000 on the second CF.
But Cassandra crashes even at about 120 documents/s.
Disk I/O, monitored with the Windows performance tools, is moderate on
both disks (the commitlog is on a separate hard disk).
Any ideas?
If it's really a bug, in my opinion it's very critical.
Aaron Morton <aa...@thelastpickle.com> wrote:
> I remember you have 2 CFs, but what are the settings for:
>
> - memtable_flush_after_mins
> - memtable_throughput_in_mb
> - memtable_operations_in_millions
> - keys_cached
> - rows_cached
>
> - in_memory_compaction_limit_in_mb
>
> Can you do the JVM Heap Calculation here and see what it says
> http://wiki.apache.org/cassandra/MemtableThresholds
>
> What Consistency Level are you writing at? (Checking it's not Zero)
>
> When you talk about 200 inserts per second, is that storing 200
> documents through Lucandra or 200 requests to Cassandra? If it's the
> first option I would assume that would generate a lot more actual
> requests into Cassandra. Open up jconsole and take a look at the
> WriteCount attribute for the
> CFs: http://wiki.apache.org/cassandra/MemtableThresholds
>
> You could also try setting the compaction thresholds to 0 to disable
> compaction while you are pushing this data in. Then use nodetool to
> compact and turn the settings back to normal. See cassandra.yaml for
> more info.
>
> I would have thought you could get the writes through with the setup
> you've described so far (even though a single 32-bit node is unusual).
> The best advice is to turn all the settings down (e.g. caches off,
> memtable flush at 64 MB, compaction disabled) and if it still fails try:
>
> - checking your I/O stats; not sure about Windows, but JConsole has some I/O
> stats. If your I/O cannot keep up then your server is not fast enough
> for your client load.
> - reducing the client load
>
> Hope that helps.
> Aaron
>
>
> On 04 Dec, 2010,at 05:23 AM, Max <cassan...@ajowa.de> wrote:
>
> Hi,
>
> we increased heap space to 3 GB (with the JRockit VM under 32-bit Windows
> with 4 GB RAM),
> but under "heavy" inserts Cassandra is still crashing with an OutOfMemory
> error after a GC storm.
>
> It sounds very similar to
> https://issues.apache.org/jira/browse/CASSANDRA-1177
>
> In our insert tests the average heap usage grows slowly up to the
> 3 GB limit (jconsole monitoring over 50 min:
> http://oi51.tinypic.com/k12gzd.jpg) and the CompactionManager queue is
> also growing constantly, up to about 50 pending jobs.
>
> We tried to decrease the CF memtable thresholds, but after about half a
> million inserts the node still goes down.
>
> - Cassandra 0.7.0 beta 3
> - Single Node
> - about 200 inserts/s, ~500 bytes - 1 KB each
>
>
> Is there no other option besides slowing down the inserts/s?
>
> What could be an indicator that a node runs stably with this
> amount of inserts?
>
> Thank you for your answer,
> Max
>
>
> Aaron Morton <aa...@thelastpickle.com>:
>
>> Sounds like you need to increase the heap size and/or reduce the
>> memtable_throughput_in_mb and/or turn off the internal caches.
>> Normally the binary memtable thresholds only apply to bulk load
>> operations, and it's the per-CF memtable_* settings you want to
>> change. I'm not familiar with Lucandra though.
>>
>> See the section on JVM Heap Size here
>> http://wiki.apache.org/cassandra/MemtableThresholds
>>
>> Bottom line is you will need more JVM heap memory.
>>
>> Hope that helps.
>> Aaron
>>
>> On 29 Nov, 2010,at 10:28 PM, cassan...@ajowa.de wrote:
>>
>> Hi community,
>>
>> during my tests I had several OOM crashes.
>> Getting some hints to track down the problem would be nice.
>>
>> At first Cassandra crashed after about 45 min of the insert test script.
>> During the following tests the time to OOM got shorter, until it started to crash
>> even in "idle" mode.
>>
>> Here are the facts:
>> - Cassandra 0.7 beta 3
>> - using Lucandra to index about 3 million files of ~1 KB each
>> - inserting from one client into one Cassandra node at about 200 files/s
>> - Cassandra data files for this keyspace grow to about 20 GB
>> - the keyspace only contains the two Lucandra-specific CFs
>>
>> Cluster:
>> - single Cassandra node on 32-bit Windows, Xeon 2.5 GHz, 4 GB RAM
>> - Java JRE 1.6.0_22
>> - heap space at first 1 GB, later increased to 1.3 GB
>>
>> cassandra.yaml:
>> default + reduced "binary_memtable_throughput_in_mb" to 128
>>
>> CFs:
>> default + reduced
>> min_compaction_threshold: 4
>> max_compaction_threshold: 8
>>
>>
>> I think the problem always appears during compaction,
>> and perhaps it is a result of large rows (some about 170 MB).
>>
>> Are there more options we could use to work with little memory?
>>
>> Is it a compaction problem?
>> And how can it be avoided?
>> Slower inserts? More memory?
>> An even lower memtable_throughput or in_memory_compaction_limit?
>> Continuous manual major compaction?
>>
>> I've read
>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>> - row_size should be fixed since 0.7, and 200 MB is still far away from 2 GB
>> - only the key cache is used, and only a little (3600/20000)
>> - after a lot of writes Cassandra crashes even in idle mode
>> - the memtable size was reduced and there are only 2 CFs
>>
>> Several heap dumps in MAT show 60-99% of the heap used by the compaction thread.