As far as I can see, Lucandra already uses batch_mutate:
https://github.com/tjake/Lucandra/blob/master/src/lucandra/IndexWriter.java#L263
https://github.com/tjake/Lucandra/blob/master/src/lucandra/CassandraUtils.java#L371
IndexWriter.addDocument() merges all fields into a mutation map.
In addition, instead of "autoCommit" (committing after each document), I commit only
every 10 documents, roughly as in the sketch below.
Where can I monitor incoming requests to Cassandra?
WriteCount and MutationCount (monitored with jconsole) didn't change noticeably.
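For clarity, this is roughly what my indexing loop looks like (a minimal sketch; the Lucandra method names are from memory, and "myindex" and the analyzer are just placeholders):

import java.util.List;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;

public class BulkIndexer {
    // Index all docs, flushing one batch of mutations per 10 documents.
    static void indexAll(List<Document> docs, Analyzer analyzer,
                         Cassandra.Iface client) throws Exception {
        lucandra.IndexWriter writer = new lucandra.IndexWriter("myindex", client);
        writer.setAutoCommit(false);            // no flush per document
        int n = 0;
        for (Document doc : docs) {
            writer.addDocument(doc, analyzer);  // fields merged into the mutation map
            if (++n % 10 == 0) writer.commit(); // flush every 10 documents
        }
        writer.commit();                        // flush the remainder
    }
}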
I had problems opening the JRockit heap dump with MAT, but found
JRockit Mission Control instead. Unfortunately I'm not confident
using it.
Here are my observations:
While a HeapByteBuffer was growing (~200 MB) and being flushed during client
inserts, the byte[] usage was growing permanently.
http://oi51.tinypic.com/2uhbdp3.jpg
I used the TypeGraph view to analyze the byte[], but I'm not sure how to interpret it:
http://oi53.tinypic.com/y2d1i.jpg
Thank you!
Max
Aaron Morton <aa...@thelastpickle.com> wrote:
Jake, or anyone else, got experience bulk loading into Lucandra?
Or does anyone have experience with JRockit?
Max, are you sending one document at a time into Lucene? Can you
send them in batches (like Solr)? If so, does it reduce the
number of requests going to Cassandra?
Also, cassandra.bat is configured
with -XX:+HeapDumpOnOutOfMemoryError, so you should be able to take a
look at where all the memory is going. The Riptano blog points
to http://www.eclipse.org/mat/; also
see http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr
Hope that helps.
Aaron
On 07 Dec 2010, at 09:17 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
Accidentally sent to me.
Begin forwarded message:
From: Max <cassan...@ajowa.de>
Date: 07 December 2010 6:00:36 AM
To: Aaron Morton <aa...@thelastpickle.com>
Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
Thank you both for your answers!
After several tests with different parameters we came to the conclusion
that it must be a bug.
It looks very similar to:
https://issues.apache.org/jira/browse/CASSANDRA-1014
For both CFs we reduced thresholds:
- memtable_flush_after_mins = 60 (both CFs are used permanently,
therefore other thresholds should trigger first)
- memtable_throughput_in_mb = 40
- memtable_operations_in_millions = 0.3
- keys_cached = 0
- rows_cached = 0
- in_memory_compaction_limit_in_mb = 64
First we disabled caching, later we disabled compaction, and after that we set:
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 1
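For reference, the relevant part of our cassandra.yaml now looks roughly like this (the CF names are just examples for the two Lucandra CFs; if I remember right, in_memory_compaction_limit_in_mb is a global option in 0.7, not per-CF):

commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 1
in_memory_compaction_limit_in_mb: 64
keyspaces:
    - name: Lucandra
      replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
      replication_factor: 1
      column_families:
        - name: TermInfo
          compare_with: BytesType
          memtable_flush_after_mins: 60
          memtable_throughput_in_mb: 40
          memtable_operations_in_millions: 0.3
          keys_cached: 0
          rows_cached: 0
        - name: Documents
          # same reduced settings as TermInfo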
But our problem still appears:
While inserting files with Lucandra, memory usage grows slowly
until an OOM crash after about 50 minutes.
@Peter: In our latest test we stopped writing abruptly, but Cassandra
didn't relax and remained at ~90% heap usage even after several minutes.
http://oi54.tinypic.com/2dueeix.jpg
With the heap calculation from the wiki (memtable throughput * 3 * number
of hot CFs + 1 GB of headroom) we should need:
64 MB * 2 * 3 + 1 GB = ~1.4 GB
We ran all recent tests with 3 GB. I think that should be OK for a test
machine.
Also, the consistency level is ONE.
But Aaron is right, Lucandra produces far more than 200 inserts/s:
my 200 documents per second result in about 200 operations (WriteCount) on
the first CF and about 3000 on the second CF.
But Cassandra crashes even at about 120 documents/s.
Disk I/O, monitored with the Windows performance tools, is moderate on
both disks (the commitlog is on a separate hard disk).
Any ideas?
If it's really a bug, in my opinion it's very critical.
Aaron Morton <aa...@thelastpickle.com> wrote:
I remember you have 2 CFs, but what are the settings for:
- memtable_flush_after_mins
- memtable_throughput_in_mb
- memtable_operations_in_millions
- keys_cached
- rows_cached
- in_memory_compaction_limit_in_mb
Can you do the JVM heap calculation here and see what it says:
http://wiki.apache.org/cassandra/MemtableThresholds
What Consistency Level are you writing at? (Checking it's not Zero)
When you talk about 200 inserts per second, is that storing 200
documents through Lucandra or 200 requests to Cassandra? If it's the
first option, I would assume that would generate a lot more actual
requests into Cassandra. Open up jconsole and take a look at the
WriteCount settings for the
CFs: http://wiki.apache.org/cassandra/MemtableThresholds
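If jconsole gets tedious, you can poll the same counter with a few lines of JMX client code, something like this (the MBean name is from memory for 0.7, the keyspace/CF names are examples, and the default JMX port should be 8080; adjust as needed):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class WriteCountPoller {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
        ObjectName cf = new ObjectName(
            "org.apache.cassandra.db:type=ColumnFamilies,keyspace=Lucandra,columnfamily=TermInfo");
        for (int i = 0; i < 10; i++) {   // ten samples, 5 seconds apart
            Number writes = (Number) mbs.getAttribute(cf, "WriteCount");
            System.out.println("WriteCount = " + writes);
            Thread.sleep(5000);
        }
        jmxc.close();
    }
}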
You could also try setting the compaction thresholds to 0 to disable
compaction while you are pushing this data in. Then use nodetool to
compact, and turn the settings back to normal. See cassandra.yaml for
more info.
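e.g. the manual compaction step is just (I don't remember the exact setcompactionthreshold arguments offhand; running nodetool with no arguments prints the full usage):

  nodetool -h localhost compact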
I would have thought you could get the writes through with the setup
you've described so far (even though a single 32-bit node is unusual).
The best advice is to turn all the settings down (e.g. caches off,
memtable flush 64 MB, compaction disabled) and if it still fails try:
- checking your IO stats; not sure on Windows, but JConsole has some IO
stats. If your IO cannot keep up, then your server is not fast enough
for your client load.
- reducing the client load (e.g. by throttling it, as sketched below)
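For the second point, a dumb fixed-rate throttle on the client side is enough to test whether the load is the problem. A minimal sketch (the 120 docs/s target is just an example):

import java.util.concurrent.TimeUnit;

public class Throttle {
    private final long intervalNanos;
    private long next = System.nanoTime();

    public Throttle(int opsPerSecond) {
        this.intervalNanos = TimeUnit.SECONDS.toNanos(1) / opsPerSecond;
    }

    // Call before each insert; sleeps just long enough to hold the target rate.
    public void acquire() throws InterruptedException {
        next += intervalNanos;
        long sleepNanos = next - System.nanoTime();
        if (sleepNanos > 0) TimeUnit.NANOSECONDS.sleep(sleepNanos);
    }
}

// usage: Throttle throttle = new Throttle(120);
//        then call throttle.acquire() before each document insert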
Hope that helps.
Aaron
On 04 Dec 2010, at 05:23 AM, Max <cassan...@ajowa.de> wrote:
Hi,
we increased the heap space to 3 GB (with the JRockit VM under 32-bit
Windows with 4 GB RAM),
but under "heavy" inserts Cassandra is still crashing with an OutOfMemory
error after a GC storm.
It sounds very similar to
https://issues.apache.org/jira/browse/CASSANDRA-1177
In our insert tests the average heap usage is slowly growing up to the
3 GB limit (jconsole monitored over 50 min:
http://oi51.tinypic.com/k12gzd.jpg), and the CompactionManager queue is
also constantly growing, up to about 50 jobs pending.
We tried to decrease the CF memtable thresholds, but after about half a
million inserts the heap is exhausted.
- Cassandra 0.7.0 beta 3
- Single Node
- about 200 inserts/s, each ~500 bytes to 1 KB
Is there no other option besides slowing down the insert rate?
What would be a good indicator that a node runs stably under this
amount of inserts?
Thank you for your answer,
Max
Aaron Morton <aa...@thelastpickle.com> wrote:
Sounds like you need to increase the heap size and/or reduce
memtable_throughput_in_mb and/or turn off the internal caches.
Normally the binary memtable thresholds only apply to bulk load
operations; it's the per-CF memtable_* settings you want to
change. I'm not familiar with Lucandra though.
See the section on JVM Heap Size here
http://wiki.apache.org/cassandra/MemtableThresholds
Bottom line is you will need more JVM heap memory.
Hope that helps.
Aaron
On 29 Nov 2010, at 10:28 PM, cassan...@ajowa.de wrote:
Hi community,
during my tests I had several OOM crashes.
Some hints for tracking down the problem would be nice.
At first Cassandra crashed after about 45 minutes of the insert test
script. During the following tests the time to OOM got shorter and
shorter, until it started to crash
even in "idle" mode.
Here are the facts:
- Cassandra 0.7 beta 3
- using Lucandra to index about 3 million files of ~1 KB each
- inserting from one client to one Cassandra node at about 200 files/s
- Cassandra data files for this keyspace grow to about 20 GB
- the keyspace only contains the two Lucandra-specific CFs
Cluster:
- Cassandra single node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
- Java JRE 1.6.0_22
- heap space first 1 GB, later increased to 1.3 GB
cassandra.yaml:
default + reduced "binary_memtable_throughput_in_mb" to 128
CFs:
default + reduced
min_compaction_threshold: 4
max_compaction_threshold: 8
I think the problem always appears during compaction,
and perhaps it is a result of large rows (some around 170 MB).
Are there more options we could use to get by with little memory?
Is it a problem of compaction? And how can we avoid it?
Slower inserts? More memory?
Even lower memtable_throughput or in_memory_compaction_limit?
Continuous manual major compaction?
I've read
http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
- row size should be fixed since 0.7, and 200 MB is still far away from 2 GB
- only the key cache is used, and only a little (3600/20000)
- after a lot of writes Cassandra crashes even in idle mode
- the memtable size was reduced and there are only 2 CFs
Several heap dumps in MAT show 60-99% of heap usage in the compaction thread.